Genomic sequence modification method for specifically converting nucleic acid bases of targeted DNA sequence, and molecular complex for use in same

ABSTRACT

The invention provides a method of modifying a targeted site of a double stranded DNA, including a step of contacting a complex wherein a nucleic acid sequence-recognizing module that specifically binds to a target nucleotide sequence in a selected double stranded DNA and a nucleic acid base converting enzyme are linked, with the double stranded DNA, to convert one or more nucleotides in the targeted site to other one or more nucleotides or delete one or more nucleotides, or insert one or more nucleotides into the targeted site, without cleaving at least one strand of the double stranded DNA in the targeted site.

CROSS-REFERENCE TO RELATED APPLICATIONS

This patent application is a continuation of U.S. patent application Ser. No. 15/124,021, filed Nov. 9, 2016; which is the U.S. national phase of International Patent Application No. PCT/JP2015/056436, filed Mar. 4, 2015; which claims the benefit of Japanese Patent Application No. 2014-043348, filed on Mar. 5, 2014, and Japanese Patent Application No. 2014-201859, filed on Sep. 30, 2014, which are incorporated by reference in their entireties herein.

INCORPORATION-BY-REFERENCE OF MATERIAL ELECTRONICALLY SUBMITTED

Incorporated by reference in its entirety herein is a computer-readable nucleotide/amino acid sequence listing submitted concurrently herewith and identified as follows: 96.9 KB ASCII (Text) file named “150161_401C1_SEQ_LISTING.txt” created Mar. 30, 2020.

TECHNICAL FIELD

The present invention relates to a modification method of a genome sequence, which enables modification of a nucleic acid base in a particular region of a genome, without cleaving double-stranded DNA (with no cleavage or single strand cleavage), and without inserting a foreign DNA fragment, and a complex of a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme used therefor.

BACKGROUND ART

In recent years, genome editing has been attracting attention as a technique for modifying the target gene and genome region of interest in various species. Conventionally, as a method of genome editing, a method utilizing an artificial nuclease comprising a combination of a molecule having a sequence-independent DNA cleavage ability and a molecule having a sequence recognition ability has been proposed (non-patent document 1).

For example, a method of performing recombination at a target gene locus in DNA in a plant cell or insect cell as a host, by using a zinc finger nuclease (ZFN) wherein a zinc finger DNA binding domain and a non-specific DNA cleavage domain are linked (patent document 1); a method of cleaving or modifying a target gene in a particular nucleotide sequence or a site adjacent thereto by using TALEN wherein a transcription activator-like (TAL) effector, which is a DNA binding module that the plant pathogenic bacteria Xanthomonas has, and a DNA endonuclease are linked (patent document 2); a method utilizing CRISPR-Cas9 system wherein DNA sequence CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats), that functions in an acquired immune system possessed by eubacterium and archaebacterium, and nuclease Cas (CRISPR-associated) protein family having an important function along with CRISPR are combined (patent document 3) and the like have been reported. Furthermore, a method of cleaving a target gene in the vicinity of a particular sequence, by using artificial nuclease wherein a PPR protein configured to recognize a particular nucleotide sequence by a series of PPR motifs each consisting of 35 amino acids and recognizing one nucleic acid base, and nuclease are linked (patent document 4) has also been reported.

DOCUMENT LIST

Patent Documents

-   patent document 1: JP-B-4968498
-   patent document 2: National Publication of International Patent Application No. 2013-513389
-   patent document 3: National Publication of International Patent Application No. 2010-519929
-   patent document 4: JP-A-2013-128413

Non-Patent Document

-   non-patent document 1: Kelvin M Esvelt, Harris H Wang (2013) Genome-scale engineering for systems and synthetic biology, Molecular Systems Biology 9: 641

SUMMARY OF THE INVENTION

Problems to be Solved by the Invention

The genome editing techniques heretofore proposed basically presuppose double-stranded DNA breaks (DSB). However, since they involve unexpected genome modifications, side effects such as strong cytotoxicity, chromosomal rearrangement and the like occur, and they have common problems of impaired reliability in gene therapy, extremely small number of surviving cells by nucleotide modification, and difficulty in genetic modification itself in primate ovum and unicellular microorganisms.

Therefore, an object of the present invention is to provide a novel method of genome editing for modifying a nucleic acid base of a particular sequence of a gene without DSB or insertion of foreign DNA fragment, i.e., by non-cleavage of a double stranded DNA or single strand cleavage, and a complex of a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme therefor.

Means of Solving the Problems

The present inventors have conducted intensive studies in an attempt to solve the above-mentioned problems and taken note of adopting base conversion by a conversion reaction of DNA base, without accompanying DSB. The base conversion reaction by a deamination reaction of DNA base is already known; however, targeting any site by recognizing a particular sequence of DNA, and specifically modifying the targeted DNA by base conversion of DNA bases has not been realized yet.

Therefore, deaminase, that catalyzes a deamination reaction, was used as an enzyme for such conversion of nucleic acid bases, and linked to a molecule having a DNA sequence recognition ability, thereby a genome sequence was modified by nucleic acid base conversion in a region containing a particular DNA sequence.

Specifically, CRISPR-Cas system (CRISPR-mutant Cas) was used. That is, a DNA encoding an RNA molecule, wherein genome specific CRISPR-RNA:crRNA (gRNA) containing a sequence complementary to a target sequence of a gene to be modified is linked to an RNA for recruiting Cas protein (trans-activating crRNA: tracrRNA) was produced. On the other hand, a DNA wherein a DNA encoding a mutant Cas protein (dCas), wherein cleavage ability of one or both strands of a double stranded DNA is inactivated and a deaminase gene are linked, was produced. These DNAs were introduced into a host yeast cell which comprises a gene to be modified. As a result, mutation could be introduced randomly within the range of several hundred nucleotides of the gene of interest including the target sequence. Compared to when a double mutant Cas protein, which does not cleave either DNA strand in the double stranded DNA, was used, the mutation introduction efficiency increased when a mutant Cas protein which cleaves either one of the strands was used. In addition, it was clarified that the area of mutation region and variety of mutation vary depending on which of the DNA double strand is cleaved. Furthermore, mutation could be introduced extremely efficiently by targeting a plurality of regions in the target gene of interest. That is, a host cell introduced with DNA was seeded in a nonselective medium, and the sequence of the target gene of interest was examined in randomly selected colonies. As a result, introduction of mutation was confirmed in almost all colonies. It was also confirmed that genome editing can be simultaneously performed at a plurality of sites by targeting certain region in two or more target genes of interest. It was further demonstrated that the method can simultaneously introduce mutation into alleles of diploid or polyploid genomes, can introduce mutation into not only eukaryotic cells but also prokaryotic cells such as Escherichia coli, and is widely applicable irrespective of species. It was also found that editing of essential gene, which showed low efficiency heretofore, can be efficiently performed by transiently performing a nucleic acid base conversion reaction at a desired stage.

The present inventors have conducted further studies based on these findings and completed the present invention.

Accordingly, the present invention is as described below.

-   [1] A method of modifying a targeted site of a double stranded DNA, comprising a step of contacting a complex wherein a nucleic acid sequence-recognizing module that specifically binds to a target nucleotide sequence in a selected double stranded DNA and a nucleic acid base converting enzyme are linked, with said double stranded DNA, to convert one or more nucleotides in the targeted site to other one or more nucleotides or delete one or more nucleotides, or insert one or more nucleotides into said targeted site, without cleaving at least one strand of said double stranded DNA in the targeted site.
-   [2] The method of [1], wherein the nucleic acid sequence-recognizing module is selected from the group consisting of a CRISPR-Cas system wherein at least one DNA cleavage ability of Cas is inactivated, a zinc finger motif, a TAL effector and a PPR motif.
-   [3] The method of [1], wherein the nucleic acid sequence-recognizing module is a CRISPR-Cas system wherein at least one DNA cleavage ability of Cas is inactivated.
-   [4] The method of any of [1]-[3], which uses two or more kinds of nucleic acid sequence-recognizing modules each specifically binding to a different target nucleotide sequence.
-   [5] The method of [4], wherein the different target nucleotide sequence is present in a different gene.
-   [6] The method of any of [1]-[5], wherein the nucleic acid base converting enzyme is deaminase.
-   [7] The method of [6], wherein the deaminase is AID (AICDA).
-   [8] The method of any of [1]-[7], wherein the double stranded DNA is contacted with the complex by introducing a nucleic acid encoding the complex into a cell having the double stranded DNA.
-   [9] The method of [8], wherein the cell is a prokaryotic cell.
-   [10] The method of [8], wherein the cell is a eukaryotic cell.
-   [11] The method of [8], wherein the cell is a cell of a microorganism.
-   [12] The method of [8], wherein the cell is a plant cell.
-   [13] The method of [8], wherein the cell is an insect cell.
-   [14] The method of [8], wherein the cell is an animal cell.
-   [15] The method of [8], wherein the cell is a cell of a vertebrate.
-   [16] The method of [8], wherein the cell is a mammalian cell.
-   [17] The method of any of [9]-[16], wherein the cell is a polyploid cell, and a site in any targeted allele on a homologous chromosome is modified.
-   [18] The method of any of [8]-[17], comprising a step of introducing an expression vector comprising a nucleic acid encoding the complex in a form permitting control of an expression period into the cell, and a step of inducing expression of the nucleic acid for a period necessary for stabilizing the modification of the targeted site in the double stranded DNA.
-   [19] The method of [18], wherein the target nucleotide sequence in the double stranded DNA is present in a gene essential for the cell.
-   [20] A nucleic acid-modifying enzyme complex wherein a nucleic acid sequence-recognizing module that specifically binds to a target nucleotide sequence in a selected double stranded DNA and a nucleic acid base converting enzyme are linked, which converts one or more nucleotides in the targeted site to other one or more nucleotides or deletes one or more nucleotides, or inserts one or more nucleotides into said targeted site, without cleaving at least one strand of said double stranded DNA in the targeted site.
-   [21] A nucleic acid encoding the nucleic acid-modifying enzyme complex of [20].

Effect of the Invention

According to the genome editing technique of the present invention, since it is not associated with insertion of a foreign DNA or double-stranded DNA breaks, the technique is superior in safety. The technique has some possibility of providing a solution in cases where conventional methods were considered as gene recombination, and thus biologically or legally controversial. It is also theoretically possible to set a wide range of mutation introduction from a pinpoint of one base to several hundred bases, and the technique can also be applied to local evolution induction by introduction of random mutation into a particular limited region, which has been almost impossible heretofore.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic illustration showing a mechanism of the genetic modification method of the present invention using the CRISPR-Cas system.

FIG. 2 shows the results of verification, by using a budding yeast, of the effect of the genetic modification method of the present invention comprising a combination of a CRISPR-Cas system and PmCDA1 deaminase from Petromyzon marinus.

FIG. 3 shows changes in the number of surviving cells after expression induction when a CRISPR-Cas9 system using a D10A mutant of Cas9 having a nickase activity and a deaminase, PmCDA1, are used in combination (nCas9 D10A-PmCDA1), and when conventional Cas9 having a DNA double strand cleavage ability is used.

FIG. 4 shows the results when a plurality of expression constructs are constructed such that human AID deaminase and dCas9 are linked via SH3 domain and a binding ligand thereof, wherein the expression constructs are introduced into a budding yeast together with two kinds of gRNA (targeting sequences of target 4 and target 5).

FIG. 5 shows that the mutation introduction efficiency is increased by the use of Cas9 that cleaves either DNA single strand.

FIG. 6 shows that in the case where a double stranded DNA is not cleaved, the area of mutation introduction region and frequency thereof change depending on which one of the single strands is cleaved.

FIG. 7 shows that extremely high mutation introduction efficiency can be realized by targeting two regions in proximity.

FIG. 8 shows that the genetic modification method of the present invention does not require selection by marker. It was found that mutation was introduced into all colonies sequenced.

FIG. 9 shows that a plurality of sites in a genome can be simultaneously edited by the genetic modification method of the present invention. The upper panel shows the nucleotide sequence and amino acid sequence of the target site of each gene, and an arrow on the nucleotide sequence shows the target nucleotide sequence. The number at the arrow end or arrowhead indicates the position of the target nucleotide sequence terminus on ORF. The lower panel shows the results of sequencing of the target site in 5 clones each of red (R) and white (W) colonies. In the sequences, the nucleotides indicated with outline characters show occurrence of base conversion. As for responsiveness to canavanine (Can^(R)), R shows resistance, and S shows sensitivity.

FIG. 10 shows that a mutation can be simultaneously introduced into both alleles on the homologous chromosome of diploid genome by the genetic modification method of the present invention. FIG. 10A shows homologous mutation introduction efficiency of Ade1 gene (upper panel) and can1 gene respectively. FIG. 10B shows that homologous mutation was actually introduced into red colony (lower panel). Also, occurrence of heterologous mutation in white colony was shown (upper panel).

FIG. 11 shows that genome editing of Escherichia coli, a prokaryotic cell, is possible by the genetic modification method of the present invention. FIG. 11A is a schematic illustration showing the plasmid used. FIG. 11B shows that a mutation (CAA→TAA) could be efficiently introduced by targeting a region in the galK gene. FIG. 11C shows the results of sequence analysis of two clones each of the respective colonies in a nonselective medium (none), a medium containing 25 μg/ml rifampicin (Rif25) or a medium containing 50 μg/ml rifampicin (Rif50). Introduction of a mutation imparting rifampicin resistance was confirmed (upper panel). The appearance frequency of rifampicin resistance strain was estimated to be about 10% (lower panel).

FIG. 12 shows control of the edited base sites by the length of guide RNA. FIG. 12A is a conceptual figure of editing base site when the length of the target nucleotide sequence is 20 bases or 24 bases. FIG. 12B shows the results of editing by targeting gsiA gene and changing the length of the target nucleotide sequence. The mutated sites are shown with bold letters, “T” and “A” show introduction of complete mutation (C→T or G→A) into the clone, “t” shows that not less than 50% of mutation (C→T) is introduced into the clone (incomplete cloning), and “c” shows that the introduction efficiency of the mutation (C→T) into the clone is less than 50%.

FIG. 13 is a schematic illustration showing a temperature sensitive plasmid for mutation introduction, which was used in Example 11.

FIG. 14 shows the protocol of mutation introduction in Example 11.

FIG. 15 shows the results of introduction of mutation into the rpoB gene in Example 11.

FIG. 16 shows the results of introduction of mutation into the galK gene in Example 11.

DESCRIPTION OF EMBODIMENTS

The present invention provides a method of modifying a targeted site of a double stranded DNA by converting the target nucleotide sequence and nucleotides in the vicinity thereof in the double stranded DNA to other nucleotides, without cleaving at least one strand of the double stranded DNA to be modified. The method characteristically comprises a step of contacting a complex wherein a nucleic acid sequence-recognizing module that specifically binds to the target nucleotide sequence in the double stranded DNA and a nucleic acid base converting enzyme are linked, with the double stranded DNA to convert the targeted site, i.e., the target nucleotide sequence and nucleotides in the vicinity thereof, to other nucleotides.

In the present invention, the “modification” of a double stranded DNA means that a nucleotide (e.g., dC) on a DNA strand is converted to another nucleotide (e.g., dT, dA or dG), or deleted, or a nucleotide or a nucleotide sequence is inserted between certain nucleotides on the DNA strand. While the double stranded DNA to be modified is not particularly limited, it is preferably a genomic DNA. The “targeted site” of a double stranded DNA means the entire or partial “target nucleotide sequence”, which a nucleic acid sequence-recognizing module specifically recognizes and binds to, or the vicinity of the target nucleotide sequence (one or both of 5′ upstream and 3′ downstream), and the length thereof can be appropriately adjusted between 1 base and several hundred bases according to the object.

In the present invention, the “nucleic acid sequence-recognizing module” means a molecule or molecule complex having an ability to specifically recognize and bind to a particular nucleotide sequence (i.e., target nucleotide sequence) on a DNA strand. Binding of the nucleic acid sequence-recognizing module to a target nucleotide sequence enables a nucleic acid base converting enzyme linked to the module to specifically act on a targeted site of a double stranded DNA.

In the present invention, the “nucleic acid base converting enzyme” means an enzyme capable of converting a target nucleotide to another nucleotide by catalyzing a reaction for converting a substituent on a purine or pyrimidine ring on a DNA base to another group or atom, without cleaving the DNA strand.

In the present invention, the “nucleic acid-modifying enzyme complex” means a molecular complex comprising a complex of the above-mentioned nucleic acid sequence-recognizing module linked with a nucleic acid base converting enzyme, wherein the complex has nucleic acid base converting enzyme activity and is imparted with a particular nucleotide sequence recognition ability. The “complex” used herein encompasses not only one composed of a plurality of molecules, but also a single molecule having a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme such as a fusion protein.

The nucleic acid base converting enzyme used in the present invention is not particularly limited as long as it can catalyze the above-mentioned reaction, and examples thereof include deaminase belonging to the nucleic acid/nucleotide deaminase superfamily, which catalyzes a deamination reaction that converts an amino group to a carbonyl group. Preferable examples thereof include cytidine deaminase capable of converting cytosine or 5-methylcytosine to uracil or thymine, respectively, adenosine deaminase capable of converting adenine to hypoxanthine, guanosine deaminase capable of converting guanine to xanthine and the like. As cytidine deaminase, more preferred is activation-induced cytidine deaminase (hereinafter also referred to as AID), which is an enzyme that introduces a mutation into an immunoglobulin gene in the acquired immunity of vertebrate or the like.
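The base conversions listed above can be summarized in a short sketch. This is an illustration only, assuming a plain string model of one DNA strand; the function name `deaminate_cytosines` and the editing window arguments are hypothetical and are not part of the disclosed method. The sketch writes each deaminated cytosine as T, on the assumption that the resulting uracil is read as thymine on replication, which matches the C→T outcomes described for FIG. 11 and FIG. 12.

```python
# Deamination products named in the text (substrate base -> product base).
DEAMINATION_PRODUCT = {
    "cytosine": "uracil",
    "5-methylcytosine": "thymine",
    "adenine": "hypoxanthine",
    "guanine": "xanthine",
}


def deaminate_cytosines(strand: str, start: int, end: int) -> str:
    """Model cytidine deaminase acting on strand[start:end].

    Every C inside the (hypothetical) window is written as T, i.e. the
    sequence as it would be read after the uracil is copied as thymine.
    Bases outside the window are left unchanged.
    """
    window = strand[start:end].replace("C", "T")
    return strand[:start] + window + strand[end:]


if __name__ == "__main__":
    # A CAA codon inside the window becomes TAA.
    print(deaminate_cytosines("GGCAAGG", 2, 5))
```

The window boundaries stand in for the targeted site; in practice the reachable range depends on the recognizing module and on which strand, if any, is nicked.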

While the origin of nucleic acid base converting enzyme is not particularly limited, for example, PmCDA1 (Petromyzon marinus cytosine deaminase 1) from Petromyzon marinus, or AID (Activation-induced cytidine deaminase; AICDA) from mammal (e.g., human, swine, bovine, horse, monkey etc.) can be used. The base sequence and amino acid sequence of CDS of PmCDA1 are shown in SEQ ID NOs: 1 and 2, respectively, and the base sequence and amino acid sequence of CDS of human AID are shown in SEQ ID NOs: 3 and 4, respectively.

A target nucleotide sequence in a double stranded DNA to be recognized by the nucleic acid sequence-recognizing module in the nucleic acid-modifying enzyme complex of the present invention is not particularly limited as long as the module specifically binds to any sequence in the double stranded DNA. The length of the target nucleotide sequence only needs to be sufficient for specific binding of the nucleic acid sequence-recognizing module. For example, when mutation is introduced into a particular site in the genomic DNA of a mammal, it is not less than 12 nucleotides, preferably not less than 15 nucleotides, more preferably not less than 17 nucleotides, according to the genome size thereof. While the upper limit of the length is not particularly limited, it is preferably not more than 25 nucleotides, more preferably not more than 22 nucleotides.
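The length ranges above can be expressed as a small check, together with a rough estimate of why the length is tied to genome size. Both functions are a sketch: the tier labels in `classify_target_length` are one reading of the preferred ranges stated above, and `expected_random_hits` assumes a uniformly random genome searched on both strands, which is a simplifying assumption and not a statement from the specification.

```python
def classify_target_length(length: int) -> str:
    """Map a target length to the tiers described in the text (mammalian genome)."""
    if length < 12:
        return "too short"
    if length > 25:
        # The text sets no hard upper limit, only preferred ones.
        return "above preferred upper limit"
    if 17 <= length <= 22:
        return "more preferred"
    if length >= 15:
        return "preferred"
    return "acceptable"


def expected_random_hits(length: int, genome_size: int) -> float:
    """Expected chance occurrences of one specific sequence of this length.

    Assumes each base is independent and equally likely, and counts both
    strands (hence the factor 2). Real genomes deviate from this model.
    """
    return 2 * genome_size / 4 ** length


if __name__ == "__main__":
    genome = 3_000_000_000  # order of magnitude of a mammalian genome (assumed)
    for n in (12, 15, 17, 20):
        print(n, classify_target_length(n), expected_random_hits(n, genome))
```

Under this model the expected number of chance matches drops by a factor of four per added nucleotide, which is one way to see why longer targets are preferred for larger genomes.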

As the nucleic acid sequence-recognizing module in the nucleic acid-modifying enzyme complex of the present invention, CRISPR-Cas system wherein at least one DNA cleavage ability of Cas is inactivated (CRISPR-mutant Cas), zinc finger motif, TAL effector and PPR motif and the like, as well as a fragment containing a DNA binding domain of a protein that specifically binds to DNA such as restriction enzyme, transcription factor, RNA polymerase or the like, and not having a DNA double strand cleavage ability and the like can be used, but the module is not limited thereto. Preferably, the modules include CRISPR-mutant Cas, zinc finger motif, TAL effector, PPR motif and the like.

A zinc finger motif is constructed by linking 3-6 different Cys2His2 type zinc finger units (1 finger recognizes about 3 bases), and can recognize a target nucleotide sequence of 9-18 bases. A zinc finger motif can be produced by a known method such as Modular assembly method (Nat Biotechnol (2002) 20: 135-141), OPEN method (Mol Cell (2008) 31: 294-301), CoDA method (Nat Methods (2011) 8: 67-69), Escherichia coli one-hybrid method (Nat Biotechnol (2008) 26: 695-701) and the like. The above-mentioned patent document 1 can be referred to for the details of the zinc finger motif production.
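The 9-18 base range follows directly from the number of linked fingers. A minimal sketch of that arithmetic, assuming exactly 3 bases per finger (the text says "about 3") and using a hypothetical helper name:

```python
BASES_PER_FINGER = 3  # the text states about 3 bases per Cys2His2 finger


def zinc_finger_target_length(n_fingers: int) -> int:
    """Return the target length recognized by a motif of n_fingers units."""
    if not 3 <= n_fingers <= 6:
        raise ValueError("the text describes motifs built from 3-6 finger units")
    return n_fingers * BASES_PER_FINGER


if __name__ == "__main__":
    for n in range(3, 7):
        print(n, "fingers ->", zinc_finger_target_length(n), "bases")
```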

A TAL effector has a module repeat structure with about 34 amino acids as a unit, and the 12th and 13th amino acid residues (called RVD) of one module determine the binding stability and base specificity. Since each module is highly independent, TAL effector specific to a target nucleotide sequence can be produced by simply linking the modules. For TAL effector, production methods utilizing an open resource (REAL method (Curr Protoc Mol Biol (2012) Chapter 12: Unit 12.15), FLASH method (Nat Biotechnol (2012) 30: 460-465), and Golden Gate method (Nucleic Acids Res (2011) 39: e82) etc.) have been established, and a TAL effector for a target nucleotide sequence can be designed relatively easily. The above-mentioned patent document 2 can be referred to for the details of the production of TAL effector.

PPR motif is constructed such that a particular nucleotide sequence is recognized by a series of PPR motifs each consisting of 35 amino acids and recognizing one nucleic acid base, and recognizes a target base only by 1, 4 and ii(−2) amino acids of each motif. Motif configuration has no dependency, and is free of interference of motifs on both sides. Therefore, similar to TAL effector, a PPR protein specific to the target nucleotide sequence can be produced by simply linking PPR motifs. The above-mentioned patent document 4 can be referred to for the details of the production of PPR motif.

When a fragment of a restriction enzyme, transcription factor, RNA polymerase or the like is used, since the DNA binding domains of these proteins are well known, a fragment containing said domain and not having a DNA double strand cleavage ability can be easily designed and constructed.

Any of the above-mentioned nucleic acid sequence-recognizing modules can be provided as a fusion protein with the above-mentioned nucleic acid base converting enzyme, or a protein binding domain such as SH3 domain, PDZ domain, GK domain, GB domain and the like and a binding partner thereof may be fused with a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme, respectively, and provided as a protein complex via an interaction of the domain and a binding partner thereof. Alternatively, a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme may be each fused with intein, and they can be linked by ligation after protein synthesis.

The nucleic acid-modifying enzyme complex of the present invention containing a complex (including fusion protein), wherein a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme are linked, may be contacted with a double stranded DNA as an enzyme reaction in a cell-free system. In view of the main object of the present invention, it is desirable to perform the contact by introducing a nucleic acid encoding the complex into a cell having the double stranded DNA of interest (e.g., genomic DNA).

Therefore, the nucleic acid sequence-recognizing module and the nucleic acid base converting enzyme are preferably prepared as a nucleic acid encoding a fusion protein thereof, or as nucleic acids encoding each of them in a form capable of forming a complex in a host cell after translation into a protein by utilizing a binding domain, intein or the like. The nucleic acid here may be a DNA or an RNA. When it is a DNA, it is preferably a double stranded DNA, and provided in the form of an expression vector disposed under regulation of a functional promoter in a host cell. When it is an RNA, it is preferably a single stranded RNA.

Since the complex of the present invention wherein a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme are linked, is not associated with double-stranded DNA breaks (DSB), genome editing with low toxicity is possible, and the genetic modification method of the present invention can be applied to a wide range of biological materials. Therefore, the cells into which nucleic acid encoding nucleic acid sequence-recognizing module and/or nucleic acid base converting enzyme is introduced can encompass cells of any species, from cells of microorganisms, such as bacteria including Escherichia coli and the like which are prokaryotes, and yeast and the like which are lower eukaryotes, to cells of higher eukaryotes such as insect, plant and the like, and cells of vertebrates including mammals such as human and the like.

A DNA encoding a nucleic acid sequence-recognizing module such as zinc finger motif, TAL effector, PPR motif and the like can be obtained by any method mentioned above for each module. A DNA encoding a sequence-recognizing module of restriction enzyme, transcription factor, RNA polymerase and the like can be cloned by, for example, synthesizing an oligoDNA primer covering a region encoding a desired part of the protein (part containing DNA binding domain) based on the cDNA sequence information thereof, and amplifying by the RT-PCR method using the total RNA or mRNA fraction prepared from the protein-producing cells as a template.

A DNA encoding a nucleic acid base converting enzyme can also be cloned similarly by synthesizing an oligoDNA primer based on the cDNA sequence information thereof, and amplifying by the RT-PCR method using the total RNA or mRNA fraction prepared from the enzyme-producing cells as a template. For example, a DNA encoding PmCDA1 of Petromyzon marinus can be cloned by designing suitable primers for the upstream and downstream of CDS based on the cDNA sequence (accession No. EF094822) registered in the NCBI database, and cloning from mRNA from Petromyzon marinus by the RT-PCR method. A DNA encoding human AID can be cloned by designing suitable primers for the upstream and downstream of CDS based on the cDNA sequence (accession No. AB040431) registered in the NCBI database, and cloning from, for example, mRNA from human lymph node by the RT-PCR method.

The cloned DNA may be directly, or after digestion with a restriction enzyme when desired, or after addition of a suitable linker and/or a nuclear localization signal (each organelle transfer signal when the target double stranded DNA of interest is mitochondria or chloroplast DNA), ligated with a DNA encoding a nucleic acid sequence-recognizing module to prepare a DNA encoding a fusion protein. Alternatively, a DNA encoding a nucleic acid sequence-recognizing module, and a DNA encoding a nucleic acid base converting enzyme may be each fused with a DNA encoding a binding domain or a binding partner thereof, or both DNAs may be fused with a DNA encoding a separation intein, whereby the nucleic acid sequence-recognizing conversion module and the nucleic acid base converting enzyme are translated in a host cell to form a complex. In these cases, a linker and/or a nuclear localization signal can be linked to a suitable position of one of or both DNAs when desired.

A DNA encoding a nucleic acid sequence-recognizing module and a DNA encoding a nucleic acid base converting enzyme can be obtained by chemically synthesizing the DNA strand, or by linking partly overlapping synthesized oligoDNA short strands by utilizing the PCR method and the Gibson Assembly method to construct a DNA encoding the full length thereof. The advantage of constructing a full-length DNA by chemical synthesis or a combination of PCR method or Gibson Assembly method is that the codon used can be designed in CDS full-length according to the host into which the DNA is introduced. In the expression of a heterologous DNA, the protein expression level is expected to increase by converting the DNA sequence thereof to a codon which is highly frequently used in the host organism. As the data of codon use frequency in host used, for example, the genetic code use frequency database (www.kazusa.or.jp/codon/index.html) disclosed in the home page of Kazusa DNA Research Institute can be used, or documents showing the codon use frequency in each host may be referred to. By reference to the obtained data and the DNA sequence to be introduced, codons showing low use frequency in the host from those used for the DNA sequence may be converted to a codon coding the same amino acid and showing high use frequency.
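The codon conversion described above can be sketched as follows. This is a minimal illustration, not the procedure used in the Examples: the function name `optimize_codons`, the `threshold` parameter, and the two small tables are hypothetical, and the frequency values are invented for demonstration rather than taken from the Kazusa database. A real table would cover all 64 codons for the chosen host.

```python
# Hypothetical, truncated tables for illustration only.
CODON_TO_AA = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
}
# Made-up relative use frequencies within each amino acid (not real host data).
HOST_USAGE = {
    "GCT": 0.38, "GCC": 0.22, "GCA": 0.29, "GCG": 0.11,
    "AAA": 0.58, "AAG": 0.42,
}


def optimize_codons(cds, codon_to_aa, usage, threshold=0.25):
    """Replace low-frequency codons with the most used synonymous codon.

    A codon is treated as "low frequency" when its usage is below
    `threshold` (an assumed cutoff); the encoded amino acid never changes.
    """
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")

    # Most frequently used codon for each amino acid in the host.
    best = {}
    for codon, aa in codon_to_aa.items():
        if aa not in best or usage[codon] > usage[best[aa]]:
            best[aa] = codon

    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if usage[codon] < threshold:
            codon = best[codon_to_aa[codon]]
        out.append(codon)
    return "".join(out)


if __name__ == "__main__":
    print(optimize_codons("GCGAAGGCC", CODON_TO_AA, HOST_USAGE))
```

Replacing only the codons below a cutoff, rather than every codon, keeps the sequence closer to the original; always choosing the single most frequent codon is the simplest policy and other weighting schemes are possible.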

An expression vector containing a DNA encoding a nucleic acid sequence-recognizing module and/or a nucleic acid base converting enzyme can be produced, for example, by linking the DNA to the downstream of a promoter in a suitable expression vector.

As the expression vector, plasmids from Escherichia coli (e.g., pBR322, pBR325, pUC12, pUC13); plasmids from Bacillus subtilis (e.g., pUB110, pTP5, pC194); plasmids from yeast (e.g., pSH19, pSH15); insect cell expression plasmids (e.g., pFast-Bac); animal cell expression plasmids (e.g., pA1-11, pXT1, pRc/CMV, pRc/RSV, pcDNAI/Neo); bacteriophages such as λ phage and the like; insect virus vectors such as baculovirus and the like (e.g., BmNPV, AcNPV); animal virus vectors such as retrovirus, vaccinia virus, adenovirus and the like, are used.

As the promoter, any promoter appropriate for the host used for gene expression can be used. In a conventional method involving DSB, since the survival rate of the host cell sometimes decreases markedly due to toxicity, it is desirable to increase the number of cells before the start of induction by using an inducible promoter. However, since sufficient cell proliferation can be achieved even while the nucleic acid-modifying enzyme complex of the present invention is expressed, a constitutive promoter can also be used without limitation.

For example, when the host is an animal cell, SRα promoter, SV40 promoter, LTR promoter, CMV (cytomegalovirus) promoter, RSV (Rous sarcoma virus) promoter, MoMuLV (Moloney murine leukemia virus) LTR, HSV-TK (herpes simplex virus thymidine kinase) promoter and the like are used. Of these, CMV promoter, SRα promoter and the like are preferable.

When the host is Escherichia coli, trp promoter, lac promoter, recA promoter, λP_L promoter, lpp promoter, T7 promoter and the like are preferable.

When the host is genus Bacillus, SPO1 promoter, SPO2 promoter, penP promoter and the like are preferable.

When the host is a yeast, Gal1/10 promoter, PHO5 promoter, PGK promoter, GAP promoter, ADH promoter and the like are preferable.

When the host is an insect cell, polyhedrin promoter, P10 promoter and the like are preferable.

When the host is a plant cell, CaMV35S promoter, CaMV19S promoter, NOS promoter and the like are preferable.

As the expression vector, besides those mentioned above, one containing an enhancer, a splicing signal, a terminator, a polyA addition signal, a selection marker such as a drug resistance gene, an auxotrophy-complementing gene and the like, a replication origin and the like can be used as desired.

An RNA encoding a nucleic acid sequence-recognizing module and/or a nucleic acid base converting enzyme can be prepared by, for example, transcription to mRNA in an in vitro transcription system known per se, using as a template a vector containing a DNA encoding the above-mentioned nucleic acid sequence-recognizing module and/or nucleic acid base converting enzyme.

A complex of a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme can be intracellularly expressed by introducing an expression vector containing a DNA encoding a nucleic acid sequence-recognizing module and/or a nucleic acid base converting enzyme into a host cell, and culturing the host cell.

As the host, genus Escherichia, genus Bacillus, yeast, insect cell, insect, animal cell and the like are used.

As the genus Escherichia, Escherichia coli K12·DH1 [Proc. Natl. Acad. Sci. USA, 60, 160 (1968)], Escherichia coli JM103 [Nucleic Acids Research, 9, 309 (1981)], Escherichia coli JA221 [Journal of Molecular Biology, 120, 517 (1978)], Escherichia coli HB101 [Journal of Molecular Biology, 41, 459 (1969)], Escherichia coli C600 [Genetics, 39, 440 (1954)] and the like are used.

As the genus Bacillus, Bacillus subtilis M1114 [Gene, 24, 255 (1983)], Bacillus subtilis 207-21 [Journal of Biochemistry, 95, 87 (1984)] and the like are used.

As the yeast, Saccharomyces cerevisiae AH22, AH22R⁻, NA87-11A, DKD-5D, 20B-12, Schizosaccharomyces pombe NCYC1913, NCYC2036, Pichia pastoris KM71 and the like are used.

As the insect cell when the virus is AcNPV, cells of an established line from fall armyworm larva (Spodoptera frugiperda cell; Sf cell), MG1 cells from the midgut of Trichoplusia ni, High Five™ cells from an egg of Trichoplusia ni, cells from Mamestra brassicae, cells from Estigmene acrea and the like are used. When the virus is BmNPV, cells of an established line from Bombyx mori (Bombyx mori N cell; BmN cell) and the like are used as insect cells. As the Sf cell, for example, Sf9 cell (ATCC CRL1711), Sf21 cell [all above, In Vivo, 13, 213-217 (1977)] and the like are used.

As the insect, for example, larva of Bombyx mori, Drosophila, cricket and the like are used [Nature, 315, 592 (1985)].

As the animal cell, cell lines such as monkey COS-7 cell, monkey Vero cell, Chinese hamster ovary (CHO) cell, dhfr gene-deficient CHO cell, mouse L cell, mouse AtT-20 cell, mouse myeloma cell, rat GH3 cell, human FL cell and the like, pluripotent stem cells such as iPS cell, ES cell and the like of human and other mammals, and primary cultured cells prepared from various tissues are used. Furthermore, zebrafish embryo, Xenopus oocyte and the like can also be used.

As the plant cell, suspension-cultured cells, callus, protoplast, leaf segment, root segment and the like prepared from various plants (e.g., grains such as rice, wheat, corn and the like, product crops such as tomato, cucumber, eggplant and the like, garden plants such as carnation, Eustoma russellianum and the like, experimental plants such as tobacco, Arabidopsis thaliana and the like) are used.

All the above-mentioned host cells may be haploid (monoploid) or polyploid (e.g., diploid, triploid, tetraploid and the like). In conventional mutation introduction methods, a mutation is, in principle, introduced into only one of the homologous chromosomes to produce a heterozygous genotype. Therefore, the desired trait is not expressed unless the mutation is dominant, and obtaining a homozygote inconveniently requires labor and time. In contrast, according to the present invention, since mutations can be introduced into all alleles on the homologous chromosomes in the genome, the desired trait can be expressed in a single generation even in the case of a recessive mutation (FIG. 10), which is extremely useful since the problem of the conventional method can be solved.

An expression vector can be introduced by a known method (e.g., lysozyme method, competent method, PEG method, CaCl₂ coprecipitation method, electroporation method, microinjection method, particle gun method, lipofection method, Agrobacterium method and the like) according to the kind of the host.

Escherichia coli can be transformed according to the methods described in, for example, Proc. Natl. Acad. Sci. USA, 69, 2110 (1972), Gene, 17, 107 (1982) and the like.

A vector can be introduced into the genus Bacillus according to the methods described in, for example, Molecular & General Genetics, 168, 111 (1979) and the like.

A vector can be introduced into a yeast according to the methods described in, for example, Methods in Enzymology, 194, 182-187 (1991), Proc. Natl. Acad. Sci. USA, 75, 1929 (1978) and the like.

A vector can be introduced into an insect cell and an insect according to the methods described in, for example, Bio/Technology, 6, 47-55 (1988) and the like.

A vector can be introduced into an animal cell according to the methods described in, for example, Cell Engineering additional volume 8, New Cell Engineering Experiment Protocol, 263-267 (1995) (published by Shujunsha), and Virology, 52, 456 (1973).

A cell into which a vector has been introduced can be cultured by a known method suited to the kind of the host.

For example, when Escherichia coli or genus Bacillus is cultured, a liquid medium is preferable as the medium used for the culture. The medium preferably contains a carbon source, nitrogen source, inorganic substance and the like necessary for the growth of the transformant. Examples of the carbon source include glucose, dextrin, soluble starch, sucrose and the like; examples of the nitrogen source include inorganic or organic substances such as ammonium salts, nitrate salts, corn steep liquor, peptone, casein, meat extract, soybean cake, potato extract and the like; and examples of the inorganic substance include calcium chloride, sodium dihydrogen phosphate, magnesium chloride and the like. The medium may contain yeast extract, vitamins, growth promoting factor and the like. The pH of the medium is preferably about 5-about 8.

As a medium for culturing Escherichia coli, for example, M9 medium containing glucose and casamino acid [Journal of Experiments in Molecular Genetics, 431-433, Cold Spring Harbor Laboratory, New York 1972] is preferable. Where necessary, for example, agents such as 3β-indolylacrylic acid may be added to the medium to ensure an efficient function of a promoter. Escherichia coli is cultured at generally about 15-about 43° C. Where necessary, aeration and stirring may be performed.

The genus Bacillus is cultured at generally about 30-about 40° C. Where necessary, aeration and stirring may be performed.

Examples of the medium for culturing yeast include Burkholder minimum medium [Proc. Natl. Acad. Sci. USA, 77, 4505 (1980)], SD medium containing 0.5% casamino acid [Proc. Natl. Acad. Sci. USA, 81, 5330 (1984)] and the like. The pH of the medium is preferably about 5-about 8. The culture is performed at generally about 20° C.-about 35° C. Where necessary, aeration and stirring may be performed.

As a medium for culturing an insect cell or insect, for example, Grace's Insect Medium [Nature, 195, 788 (1962)] containing an additive such as inactivated 10% bovine serum and the like as appropriate is used. The pH of the medium is preferably about 6.2-about 6.4. The culture is performed at generally about 27° C. Where necessary, aeration and stirring may be performed.

As a medium for culturing an animal cell, for example, minimum essential medium (MEM) containing about 5-about 20% of fetal bovine serum [Science, 122, 501 (1952)], Dulbecco's modified Eagle medium (DMEM) [Virology, 8, 396 (1959)], RPMI 1640 medium [The Journal of the American Medical Association, 199, 519 (1967)], 199 medium [Proceeding of the Society for the Biological Medicine, 73, 1 (1950)] and the like are used. The pH of the medium is preferably about 6-about 8. The culture is performed at generally about 30° C.-about 40° C. Where necessary, aeration and stirring may be performed.

As a medium for culturing a plant cell, for example, MS medium, LS medium, B5 medium and the like are used. The pH of the medium is preferably about 5-about 8. The culture is performed at generally about 20° C.-about 30° C. Where necessary, aeration and stirring may be performed.

As mentioned above, a complex of a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme, i.e., a nucleic acid-modifying enzyme complex, can be expressed intracellularly.

An RNA encoding a nucleic acid sequence-recognizing module and/or a nucleic acid base converting enzyme can be introduced into a host cell by the microinjection method, lipofection method and the like. RNA introduction can be performed once or multiple times (e.g., 2-5 times) at suitable intervals.

When a complex of a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme is expressed by an expression vector or RNA molecule introduced into the cell, the nucleic acid sequence-recognizing module specifically recognizes and binds to a target nucleotide sequence in the double stranded DNA (e.g., genomic DNA) of interest. Due to the action of the nucleic acid base converting enzyme linked to the nucleic acid sequence-recognizing module, base conversion occurs in the sense strand or antisense strand of the targeted site (the whole or a part of the target nucleotide sequence, or a region appropriately adjusted within several hundred bases including the vicinity thereof), and a mismatch occurs in the double stranded DNA (e.g., when a cytidine deaminase such as PmCDA1, AID and the like is used as the nucleic acid base converting enzyme, cytosine on the sense strand or antisense strand at the targeted site is converted to uracil to cause a U:G or G:U mismatch). Various mutations are introduced when the mismatch is not correctly repaired and is instead repaired such that a base of the opposite strand forms a pair with the base of the converted strand (T-A or A-T in the above-mentioned example), when another nucleotide is further substituted (e.g., U→A, G), or when one to several dozen bases are deleted or inserted during repair.
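The simplest outcome described above, in which the U:G or G:U mismatch is fixed as a T-A or A-T pair, can be sketched on the sense-strand sequence as follows. The function name `fix_deamination` and its parameters are hypothetical, and the sketch ignores the other repair outcomes (further substitution, deletion, insertion) mentioned in the text.

```python
def fix_deamination(sense: str, index: int, strand: str) -> str:
    """Return the sense-strand sequence after a deaminated cytosine
    at 0-based `index` has been fixed as a C:G -> T:A change.

    strand = "sense":     the C itself is on the sense strand (C -> U -> T).
    strand = "antisense": the C is on the antisense strand, opposite a G
                          on the sense strand, which ends up as A.
    """
    sense = sense.upper()
    base = sense[index]
    if strand == "sense" and base == "C":
        new_base = "T"
    elif strand == "antisense" and base == "G":
        new_base = "A"
    else:
        raise ValueError("no cytosine on the requested strand at this position")
    return sense[:index] + new_base + sense[index + 1:]
```

For example, under these assumptions `fix_deamination("ACGT", 1, "sense")` gives `"ATGT"`, and `fix_deamination("ACGT", 2, "antisense")` gives `"ACAT"`.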

As for zinc finger motif, production of many actually functional zinc finger motifs is not easy, since production efficiency of a zinc finger that specifically binds to a target nucleotide sequence is not high and selection of a zinc finger having high binding specificity is not easy. While TAL effector and PPR motif have a high degree of freedom of target nucleic acid sequence recognition as compared to zinc finger motif, a problem remains in the efficiency since a large protein needs to be designed and constructed every time according to the target nucleotide sequence.

In contrast, since the CRISPR-Cas system recognizes the sequence of the double stranded DNA of interest by a guide RNA complementary to the target nucleotide sequence, any sequence can be targeted by simply synthesizing an oligoDNA capable of specifically forming a hybrid with the target nucleotide sequence.

Therefore, in a more preferable embodiment of the present invention, a CRISPR-Cas system wherein at least one DNA cleavage ability of Cas is inactivated (CRISPR-mutant Cas) is used as a nucleic acid sequence-recognizing module.

FIG. 1 is a schematic illustration showing the double stranded DNA modification method of the present invention using CRISPR-mutant Cas as a nucleic acid sequence-recognizing module.

The nucleic acid sequence-recognizing module of the present invention using CRISPR-mutant Cas is provided as a complex of an RNA molecule consisting of a guide RNA complementary to the target nucleotide sequence and tracrRNA necessary for recruiting mutant Cas protein, and a mutant Cas protein.

The Cas protein used in the present invention is not particularly limited as long as it belongs to the CRISPR system, and is preferably Cas9. Examples of Cas9 include, but are not limited to, Cas9 (SpCas9) from Streptococcus pyogenes, Cas9 (StCas9) from Streptococcus thermophilus and the like, preferably SpCas9. As the mutant Cas used in the present invention, either a Cas in which the ability to cleave both strands of the double stranded DNA is inactivated, or a Cas having nickase activity in which the ability to cleave only one of the strands is inactivated, can be used. For example, in the case of SpCas9, a D10A mutant wherein the 10th Asp residue is converted to an Ala residue and which lacks the ability to cleave the strand opposite to the strand forming a complementary strand with the guide RNA, an H840A mutant wherein the 840th His residue is converted to an Ala residue and which lacks the ability to cleave the strand complementary to the guide RNA, or a double mutant thereof can be used, and another mutant Cas can be used similarly.

A nucleic acid base converting enzyme is provided as a complex with mutant Cas by a method similar to the linking scheme with the above-mentioned zinc finger and the like. Alternatively, a nucleic acid base converting enzyme and mutant Cas can also be linked by utilizing an RNA scaffold with RNA aptamers MS2F6, PP7 and the like and proteins binding thereto. The guide RNA forms a complementary strand with the target nucleotide sequence, mutant Cas is recruited by the attached tracrRNA, and mutant Cas recognizes the DNA cleavage site recognition sequence PAM (protospacer adjacent motif) (when SpCas9 is used, PAM is the 3 bases NGG (N is any base), and, theoretically, any position on the genome can be targeted). One or both DNA strands are not cleaved and, due to the action of the nucleic acid base converting enzyme linked to the mutant Cas, base conversion occurs in the targeted site (appropriately adjusted within several hundred bases including the whole or a part of the target nucleotide sequence) and a mismatch occurs in the double stranded DNA. Various mutations are introduced when the mismatch is not correctly repaired and is instead repaired such that a base of the opposite strand forms a pair with the base of the converted strand, when another nucleotide is further converted, or when one to several dozen bases are deleted or inserted during repair (see, e.g., FIG. 2).
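The PAM requirement described above can be illustrated by listing candidate target nucleotide sequences on one strand. This is a minimal sketch that assumes the NGG PAM of SpCas9 lies immediately 3′ of the target nucleotide sequence on the scanned strand; the function name `find_targets` and the default length of 20 nucleotides are illustrative choices, and the complementary strand would have to be scanned separately.

```python
def find_targets(seq: str, length: int = 20) -> list:
    """List (start, target, pam) for every site where `length` nucleotides
    are immediately followed by an NGG PAM on the given strand.

    `start` is the 0-based index of the 5'-terminal nucleotide of the target.
    """
    seq = seq.upper()
    hits = []
    # pam_start is the index of the "N" of NGG; the target ends just before it.
    for pam_start in range(length, len(seq) - 2):
        if seq[pam_start + 1:pam_start + 3] == "GG":
            target = seq[pam_start - length:pam_start]
            pam = seq[pam_start:pam_start + 3]
            hits.append((pam_start - length, target, pam))
    return hits
```

For a 23-nucleotide input consisting of twenty A residues followed by TGG, the sketch returns a single hit whose target is the twenty A residues and whose PAM is TGG.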

Even when CRISPR-mutant Cas is used as a nucleic acid sequence-recognizing module, a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme are introduced, desirably in the form of a nucleic acid encoding same, into a cell having a double stranded DNA of interest, similar to when zinc finger and the like are used as a nucleic acid sequence-recognizing module.

A DNA encoding Cas can be cloned from a cell producing the enzyme, by a method similar to the above-mentioned method for a DNA encoding a nucleic acid base converting enzyme. A mutant Cas can be obtained by introducing, into the cloned DNA encoding Cas, a mutation that converts an amino acid residue of the part important for the DNA cleavage activity (e.g., the 10th Asp residue and the 840th His residue for Cas9, though not limited thereto) to another amino acid, by a site-directed mutagenesis method known per se.

Alternatively, a DNA encoding mutant Cas can also be constructed as a DNA having codon usage suitable for expression in the host cell to be used, by a method similar to those mentioned above for a DNA encoding a nucleic acid sequence-recognizing module and a DNA encoding a nucleic acid base converting enzyme, by chemical synthesis or in combination with the PCR method or the Gibson Assembly method. For example, a CDS sequence and an amino acid sequence optimized for the expression of SpCas9 in eukaryotic cells are shown in SEQ ID NOs: 5 and 6. In the sequence shown in SEQ ID NO: 5, when “A” is converted to “C” in base No. 29, a DNA encoding a D10A mutant can be obtained, and when “CA” is converted to “GC” in base Nos. 2518-2519, a DNA encoding an H840A mutant can be obtained.
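The correspondence between the base numbers and the residue numbers given above can be checked by simple arithmetic. The sketch below uses hypothetical helper names; the codons GAC and CAC are assumed purely for illustration, since the actual codons at these positions in SEQ ID NO: 5 are not reproduced here. In the standard genetic code, GAT and GAC encode Asp, CAT and CAC encode His, and GCT and GCC encode Ala.

```python
def codon_of_base(base_no: int):
    """Map a 1-based base number in a CDS to
    (1-based codon number, 1-based position within that codon)."""
    if base_no < 1:
        raise ValueError("base numbers are 1-based")
    return (base_no - 1) // 3 + 1, (base_no - 1) % 3 + 1


def substitute(codon: str, pos_in_codon: int, new_bases: str) -> str:
    """Replace len(new_bases) bases of `codon`, starting at the
    1-based position `pos_in_codon`."""
    start = pos_in_codon - 1
    return codon[:start] + new_bases + codon[start + len(new_bases):]


# Base No. 29 is the 2nd base of codon 10 (bases 28-30).
# Assumed Asp codon GAC: A -> C at position 2 gives GCC (Ala), i.e. D10A.
d10a = substitute("GAC", codon_of_base(29)[1], "C")

# Base Nos. 2518-2519 are the 1st and 2nd bases of codon 840 (bases 2518-2520).
# Assumed His codon CAC: CA -> GC gives GCC (Ala), i.e. H840A.
h840a = substitute("CAC", codon_of_base(2518)[1], "GC")
```

Whichever Asp codon (GAT or GAC) or His codon (CAT or CAC) is actually present, the stated substitutions yield GCT or GCC, both of which encode Ala.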

A DNA encoding a mutant Cas and a DNA encoding a nucleic acid base converting enzyme may be linked to allow for expression as a fusion protein, or designed to be separately expressed using a binding domain, intein or the like, and form a complex in a host cell via protein-protein interaction or protein ligation.

The obtained DNA encoding a mutant Cas and/or a nucleic acid base converting enzyme can be inserted downstream of a promoter of an expression vector similar to the one mentioned above, according to the host.

On the other hand, a DNA encoding guide RNA and tracrRNA can be obtained by designing an oligoDNA sequence linking a guide RNA sequence complementary to the target nucleotide sequence and a known tracrRNA sequence (e.g., gttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtggtgctttt; SEQ ID NO: 7) and chemically synthesizing it using a DNA/RNA synthesizer.

While the length of the guide RNA sequence is not particularly limited as long as it can specifically bind to a target nucleotide sequence, for example, it is 15-30 nucleotides, preferably 18-24 nucleotides.
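The oligoDNA design described above, in which a guide RNA sequence is linked to the known tracrRNA sequence, can be sketched as follows. The function names are hypothetical, the caller is assumed to supply the guide sequence already written in the orientation to be synthesized, and the scaffold string is the sequence quoted in the text as SEQ ID NO: 7.

```python
# tracrRNA-encoding sequence quoted in the text (SEQ ID NO: 7).
TRACR_DNA = (
    "gttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagtggcaccgagtcggtggtgctttt"
)


def guide_length_ok(guide: str, preferred: bool = False) -> bool:
    """Check the guide length against the 15-30 nt range of the text,
    or against the preferred 18-24 nt range when `preferred` is True."""
    low, high = (18, 24) if preferred else (15, 30)
    return low <= len(guide) <= high


def build_guide_oligo(guide: str) -> str:
    """Link a guide sequence (DNA letters) to the tracrRNA-encoding sequence."""
    guide = guide.lower()
    if set(guide) - set("acgt"):
        raise ValueError("guide must contain only A, C, G and T")
    if not guide_length_ok(guide):
        raise ValueError("guide length must be 15-30 nucleotides")
    return guide + TRACR_DNA
```

Keeping the length check separate from the linking step makes it easy to test several candidate guide lengths for the same target before an oligoDNA is synthesized.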

A DNA encoding guide RNA and tracrRNA can also be inserted into an expression vector similar to the one mentioned above, according to the host. In this case, a pol III promoter (e.g., SNR6, SNR52, SCR1, RPR1, U6, H1 promoter etc.) and a terminator (e.g., T₆ sequence) are preferably used.

An RNA encoding mutant Cas and/or a nucleic acid base converting enzyme can be prepared by, for example, transcription to mRNA in an in vitro transcription system known per se, using as a template a vector containing a DNA encoding the above-mentioned mutant Cas and/or nucleic acid base converting enzyme.

Guide RNA-tracrRNA can be obtained by designing an oligoDNA sequence in which a sequence complementary to the target nucleotide sequence and a known tracrRNA sequence are linked, and chemically synthesizing it using a DNA/RNA synthesizer.

A DNA or RNA encoding mutant Cas and/or a nucleic acid base converting enzyme, guide RNA-tracrRNA or a DNA encoding same can be introduced into a host cell by a method similar to the above, according to the host.

Since a conventional artificial nuclease involves double-stranded DNA breaks (DSB), targeting a sequence in the genome may cause growth inhibition and cell death, assumedly due to disordered cleavage (off-target cleavage) of the chromosome. This effect is particularly fatal for many microorganisms and prokaryotes, and limits applicability. In the present invention, mutation is introduced not by DNA cleavage but by a conversion reaction of the substituent on the DNA base (particularly a deamination reaction), and therefore, a drastic reduction of toxicity can be realized. In fact, as shown in the comparison tests using a budding yeast as a host in the below-mentioned Examples, when Cas9 having a conventional type of DSB activity is used, the number of surviving cells decreases upon induction of expression, whereas it was confirmed that the cells continued to grow and the number of surviving cells increased with the technique of the present invention using a combination of mutant Cas and a nucleic acid base converting enzyme (FIG. 3).

The modification of the double stranded DNA in the present invention does not preclude occurrence of cleavage of the double stranded DNA in a site other than the targeted site (appropriately adjusted within several hundred bases including the whole or a part of the target nucleotide sequence). However, one of the greatest advantages of the present invention is avoidance of toxicity by off-target cleavage, which is generally applicable to any species. In one preferable embodiment, therefore, the modification of the double stranded DNA in the present invention is not associated with cleavage of a DNA strand not only in a targeted site of a selected double stranded DNA but also in other sites.

As shown in the below-mentioned Examples, when Cas having a nickase activity capable of cleaving only one of the strands of the double stranded DNA is used as a mutant Cas (FIG. 5), the mutation introduction efficiency increases as compared to when mutant Cas which is incapable of cleaving both strands is used. Therefore, for example, by further linking a protein having a nickase activity in addition to a nucleic acid sequence-recognizing module and a nucleic acid base converting enzyme, thereby cleaving only a single DNA strand in the vicinity of the target nucleotide sequence, the mutation introduction efficiency can be improved while avoiding the strong toxicity of DSB.

Furthermore, a comparison of the effects of two kinds of mutant Cas having nickase activity, each cleaving a different strand, reveals that using one of the mutant Cas results in mutated sites accumulating near the center of the target nucleotide sequence, and using the other mutant Cas results in various mutations which are randomly introduced into a region of several hundred bases from the target nucleotide sequence (FIG. 6). Therefore, by selecting the strand to be cleaved by the nickase, a mutation can be introduced into a particular nucleotide or nucleotide region at a pinpoint, or various mutations can be randomly introduced into a comparatively wide range, and either can be adopted as appropriate according to the object. For example, when the former technique is applied to an iPS cell having a genetic disease, a cell transplantation therapeutic agent with a lower risk of rejection can be produced by repairing the mutation of the pathogenic gene in an iPS cell produced from the patient's own cell, and differentiating the cell into the somatic cell of interest.

Example 7 and the subsequent Examples mentioned below show that a mutation can be introduced into a particular nucleotide almost at a pinpoint. For pinpoint introduction of a mutation into a desired nucleotide, the target nucleotide sequence should be set so that the nucleotide into which a mutation is to be introduced has a regular positional relationship with the target nucleotide sequence. When the CRISPR-Cas system is used as a nucleic acid sequence-recognizing module and AID is used as a nucleic acid base converting enzyme, the target nucleotide sequence can be designed such that the C (or G in the opposite strand) into which a mutation is desired to be introduced is at 2-5 nucleotides from the 5′-end of the target nucleotide sequence. As mentioned above, the length of the guide RNA sequence can be appropriately determined to fall between 15-30 nucleotides, preferably 18-24 nucleotides. Since the guide RNA sequence is a sequence complementary to the target nucleotide sequence, the length of the target nucleotide sequence changes when the length of the guide RNA sequence is changed; however, the regularity that a mutation is likely to be introduced into C or G at 2-5 nucleotides from the 5′-end, irrespective of the length of the nucleotide sequence, is maintained (FIG. 12). Therefore, by appropriately determining the length of the target nucleotide sequence (guide RNA as a complementary strand thereof), the site of a base into which a mutation can be introduced can be shifted. As a result, the restriction imposed by the DNA cleavage site recognition sequence PAM (NGG) can also be removed, and the degree of freedom of mutation introduction becomes higher.
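The positional regularity described above can be sketched as follows. The sketch assumes that "2-5 nucleotides from the 5′-end" means positions 2 to 5 when the 5′-terminal nucleotide of the target nucleotide sequence is counted as position 1, and that the PAM lies immediately 3′ of the target nucleotide sequence; both the function names and these counting conventions are assumptions made for illustration.

```python
MIN_LEN, MAX_LEN = 15, 30  # guide RNA length range given in the text
WINDOW = range(2, 6)       # positions 2-5 from the 5'-end (assumed 1-based)


def cytosines_in_window(target: str) -> list:
    """1-based positions within the 2-5 window that carry a C."""
    target = target.upper()
    return [pos for pos in WINDOW
            if pos <= len(target) and target[pos - 1] == "C"]


def lengths_reaching(distance_from_pam: int) -> list:
    """Target lengths (15-30 nt) that place a given nucleotide in the window.

    `distance_from_pam` is 1 for the nucleotide immediately 5' of the PAM,
    2 for the next one upstream, and so on. For a target of length n, that
    nucleotide sits at position n - distance_from_pam + 1 from the 5'-end.
    """
    return [n for n in range(MIN_LEN, MAX_LEN + 1)
            if (n - distance_from_pam + 1) in WINDOW]
```

Under these assumptions, a C located 17 nucleotides upstream of the PAM falls in the window for target lengths of 18 to 21 nucleotides, which illustrates how changing the length of the target nucleotide sequence shifts the base into which a mutation can be introduced.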

As shown in the below-mentioned Examples, when sequence-recognizing modules are produced corresponding to a plurality of target nucleotide sequences in proximity, and simultaneously used, the mutation introduction efficiency drastically increases relative to when a single nucleotide sequence is used as a target (FIG. 7). This effect is obtained similarly when the two target nucleotide sequences partly overlap and when they are about 600 bp apart. It can occur both when the target nucleotide sequences are in the same direction (the target nucleotide sequences are present on the same strand) (FIG. 7), and when they are opposed (the target nucleotide sequences are present on each strand of the double stranded DNA) (FIG. 4).

As shown in the below-mentioned Examples, the genome sequence modification method of the present invention can introduce mutation into almost all cells in which the nucleic acid-modifying enzyme complex of the present invention has been expressed, by selecting a suitable target nucleotide sequence (FIG. 8). Thus, insertion and selection of a selection marker gene, which are essential in conventional genome editing, are not necessary. This dramatically facilitates and simplifies gene manipulation and extends the applicability to crop breeding and the like, since a recombinant organism with foreign DNA is not produced.

Since the genome sequence modification method of the present invention shows extremely high mutation introduction efficiency, and does not require selection by markers, a plurality of DNA regions at completely different positions can be modified as targets (FIG. 9). Therefore, in one preferable embodiment of the present invention, two or more kinds of nucleic acid sequence-recognizing modules that specifically bind to different target nucleotide sequences (which may be present in one target gene of interest, or in two or more different target genes of interest, which may be present on the same chromosome or different chromosomes) can be used. In this case, each one of these nucleic acid sequence-recognizing modules and a nucleic acid base converting enzyme form a nucleic acid-modifying enzyme complex. Here, a common nucleic acid base converting enzyme can be used. For example, when the CRISPR-Cas system is used as a nucleic acid sequence-recognizing module, a common complex of a Cas protein and a nucleic acid base converting enzyme (including a fusion protein) is used, and two or more kinds of chimeric RNAs of tracrRNA and each of two or more guide RNAs that respectively form a complementary strand with a different target nucleotide sequence are produced and used as guide RNA-tracrRNAs. On the other hand, when zinc finger motif, TAL effector and the like are used as nucleic acid sequence-recognizing modules, for example, a nucleic acid base converting enzyme can be fused with each nucleic acid sequence-recognizing module that specifically binds to a different target nucleotide sequence.

To express the nucleic acid-modifying enzyme complex of the present invention in a host cell, as mentioned above, an expression vector containing a DNA encoding the nucleic acid-modifying enzyme complex, or an RNA encoding the nucleic acid-modifying enzyme complex is introduced into a host cell. For efficient introduction of mutation, it is desirable to maintain expression of the nucleic acid-modifying enzyme complex at a given level or above for not less than a given period. From such aspect, introduction of an expression vector autonomously replicable in a host cell (plasmid etc.) is reliable. However, since the plasmid etc. are foreign DNAs, they are preferably removed rapidly after successful introduction of mutation. Therefore, although it varies depending on the kind of host cell and the like, for example, the introduced plasmid is desirably removed from the host cell after a lapse of 6 hr-2 days from the introduction of the expression vector by using various plasmid removal methods which are well known in the art.

Alternatively, as long as sufficient expression of a nucleic acid-modifying enzyme complex for the introduction of mutation is achieved, it is also preferable to introduce mutation into the target double stranded DNA of interest by transient expression, using an expression vector that is not autonomously replicable in a host cell (e.g., a vector lacking a replication origin that functions in the host cell and/or a gene encoding a protein necessary for replication etc.) or RNA.

Expression of the target gene is suppressed while the nucleic acid-modifying enzyme complex of the present invention is expressed in a host cell to perform a nucleic acid base conversion reaction. Therefore, it has been difficult to directly edit a gene essential for the survival of the host cell as a target gene (resulting in side effects such as growth inhibition of the host, unstable mutation introduction efficiency, mutation of a site different from the target and the like). In the present invention, direct editing of an essential gene has been successfully and efficiently realized by causing a nucleic acid base conversion reaction at a desired stage, and transiently expressing the nucleic acid-modifying enzyme complex of the present invention in a host cell for a period necessary for stabilizing the modification of the targeted site. While the period necessary for a nucleic acid base conversion reaction and for stabilizing the modification of the targeted site varies depending on the kind of the host cell, culture conditions and the like, 2-20 generations of the host cell are generally considered to be necessary. For example, when the host cell is a yeast or bacterium (e.g., Escherichia coli), expression of a nucleic acid-modifying enzyme complex needs to be induced for 5-10 generations. Those of ordinary skill in the art can appropriately determine a preferable expression induction period based on the doubling time of the host cell under the culture conditions used. For example, when a budding yeast is subjected to liquid culture in a 0.02% galactose inducer medium, the expression induction period is, for example, 20-40 hr. The expression induction period of the nucleic acid encoding the nucleic acid-modifying enzyme complex of the present invention may be extended beyond the above-mentioned “period necessary for stabilizing the modification of the targeted site” to the extent not causing side effects to the host cell.
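The determination of an expression induction period from the doubling time, as described above, amounts to a simple multiplication. In the sketch below, the function name `induction_window` is hypothetical, and the doubling time of about 4 hr is an assumption: it is not stated in the text, but it is the value at which 5-10 generations corresponds to the 20-40 hr given for budding yeast.

```python
def induction_window(doubling_time_h: float,
                     min_generations: int = 5,
                     max_generations: int = 10):
    """Return (shortest, longest) induction period in hours for a host
    that must be induced for min_generations to max_generations."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return (min_generations * doubling_time_h,
            max_generations * doubling_time_h)


# Assumed doubling time of 4 hr under inducing conditions (illustrative only).
yeast_window = induction_window(4.0)
```

The same function can be called with `min_generations=2, max_generations=20` to cover the general range of 2-20 generations mentioned for other host cells.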

As a means for transiently expressing the nucleic acid-modifying enzyme complex of the present invention at a desired stage for a desired period, a method comprising producing a construct (expression vector) containing a nucleic acid encoding the nucleic acid-modifying enzyme complex (a DNA encoding a guide RNA-tracrRNA and a DNA encoding a mutant Cas and a nucleic acid base converting enzyme in the case of the CRISPR-Cas system) in a manner that the expression period can be controlled, and introducing the construct into a host cell, can be used. The “manner that the expression period can be controlled” is specifically, for example, a nucleic acid encoding the nucleic acid-modifying enzyme complex of the present invention placed under regulation of an inducible regulatory region. While the “inducible regulatory region” is not particularly limited, it is, for example, an operon of a temperature sensitive (ts) mutation repressor and an operator regulated thereby in microorganism cells of bacterium (e.g., Escherichia coli), yeast and the like. Examples of the ts mutation repressor include, but are not limited to, a ts mutant of the cI repressor from λ phage. The λ phage cI repressor (ts) binds to the operator and suppresses expression of the downstream gene at not more than 30° C. (e.g., 28° C.). At a high temperature of not less than 37° C. (e.g., 42° C.), it dissociates from the operator to allow for induction of gene expression (FIGS. 13 and 14). Therefore, the period when the expression of the target gene is suppressed can be minimized by culturing a host cell introduced with a nucleic acid encoding the nucleic acid-modifying enzyme complex generally at not more than 30° C., raising the temperature to not less than 37° C. at an appropriate stage, performing culture for a given period to carry out a nucleic acid base conversion reaction and, after introduction of mutation into the target gene, rapidly lowering the temperature to not more than 30° C. Thus, even when an essential gene for the host cell is targeted, it can be efficiently edited while suppressing the side effects (FIG. 15).

When temperature sensitive mutation is utilized, for example, a temperature sensitive mutant of a protein necessary for autonomous replication of a vector is included in a vector containing a DNA encoding the nucleic acid-modifying enzyme complex of the present invention. As a result, autonomous replication becomes impossible rapidly after expression of the nucleic acid-modifying enzyme complex, and the vector naturally falls off during cell division. Examples of the temperature sensitive mutant protein include, but are not limited to, a temperature sensitive mutant of Rep101 ori necessary for the replication of pSC101 ori. Rep101 ori (ts) acts on pSC101 ori to enable autonomous replication of the plasmid at not more than 30° C. (e.g., 28° C.), but loses function at not less than 37° C. (e.g., 42° C.), so that the plasmid cannot replicate autonomously. Therefore, combined use with the above-mentioned cI repressor (ts) of λ phage simultaneously enables transient expression of the nucleic acid-modifying enzyme complex of the present invention and removal of the plasmid.
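The combined behavior of the two temperature sensitive elements can be summarized as a small lookup. The following Python sketch is only a schematic model under stated assumptions: the function and key names are hypothetical, and the thresholds are the "not more than 30° C." and "not less than 37° C." values given above. Temperatures between the two thresholds are not characterized in the text, so the sketch returns None for them.

```python
from typing import Dict, Optional

REPRESSED_MAX_C = 30.0  # at or below: repressor (ts) bound, Rep101 (ts) functional
INDUCED_MIN_C = 37.0    # at or above: repressor (ts) released, Rep101 (ts) inactive


def plasmid_state(temp_c: float) -> Dict[str, Optional[bool]]:
    """Schematic state of a plasmid carrying both ts elements at a given temperature."""
    if temp_c <= REPRESSED_MAX_C:
        return {"complex_expressed": False, "plasmid_replicates": True}
    if temp_c >= INDUCED_MIN_C:
        return {"complex_expressed": True, "plasmid_replicates": False}
    # Intermediate temperatures are not described in the specification.
    return {"complex_expressed": None, "plasmid_replicates": None}


for temp in (28.0, 42.0):
    print(temp, plasmid_state(temp))
```

The two example temperatures (28° C. and 42° C.) are the ones named in the text; the model shows why a shift to high temperature gives transient expression and plasmid loss at the same time.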

On the other hand, when a higher eukaryotic cell such as animal cell, insect cell, plant cell and the like is used as a host cell, transient expression of the nucleic acid-modifying enzyme complex can be realized as follows: a DNA encoding the nucleic acid-modifying enzyme complex of the present invention is introduced into the host cell under regulation of an inducible promoter (e.g., metallothionein promoter (induced by heavy metal ion), heat shock protein promoter (induced by heat shock), Tet-ON/Tet-OFF system promoter (induced by addition or removal of tetracycline or a derivative thereof), steroid-responsive promoter (induced by steroid hormone or a derivative thereof) etc.), the induction substance is added to the medium (or removed from the medium) at an appropriate stage to induce expression of the nucleic acid-modifying enzyme complex, and culture is performed for a given period to carry out a nucleic acid base conversion reaction and introduce mutation into the target gene.

In prokaryotic cells such as Escherichia coli and the like, inducible promoters can also be used. Examples of such inducible promoters include, but are not limited to, lac promoter (induced by IPTG), cspA promoter (induced by cold shock), araBAD promoter (induced by arabinose) and the like.

Alternatively, the above-mentioned inducible promoters can also be utilized as a vector removal mechanism when higher eukaryotic cells such as animal cell, insect cell, plant cell and the like are used as a host cell. That is, a vector is loaded with a replication origin that can function in a host cell, and a nucleic acid encoding a protein necessary for replication thereof (e.g., SV40 ori and large T antigen, oriP and EBNA-1 etc. for animal cells), and the expression of the nucleic acid encoding the protein is regulated by the above-mentioned inducible promoter. As a result, while the vector is autonomously replicatable in the presence of an induction substance, when the induction substance is removed, autonomous replication does not occur, and the vector naturally falls off during cell division (conversely, autonomous replication becomes impossible by the addition of tetracycline or doxycycline in the case of Tet-OFF system vector).

The present invention is explained in the following by referring to Examples, which are not to be construed as limitative.

EXAMPLE

In the below-mentioned Examples 1-6, experiments were performed as follows.

<Cell Line, Culture, Transformation, and Expression Induction>

Budding yeast Saccharomyces cerevisiae BY4741 strain (requiring leucine and uracil) was cultured in a standard YPDA medium or SD medium with a Dropout composition satisfying the auxotrophy. The culture performed was standing culture on an agar plate or shaking culture in a liquid medium between 25° C. and 30° C. Transformation was performed by a lithium acetate method, and selection was made in SD medium matching the appropriate auxotrophy. For expression induction by galactose, after preculture overnight in an appropriate SD medium, the cells were cultured overnight in SR medium with the carbon source changed from 2% glucose to 2% raffinose, and further cultured in SGal medium with the carbon source changed to 0.2-2% galactose for 3 hr to about two nights.

For the measurement of the number of surviving cells and Can1 mutation rate, a cell suspension was appropriately diluted, and applied on SD plate medium and SD-Arg+60 mg/l Canavanine plate medium or SD+300 mg/l Canavanine plate medium, and the number of colonies that emerged 3 days later was counted as the number of surviving cells. Using the number of surviving colonies on the SD plate as the total number of cells, and the number of surviving colonies on the Canavanine plate as the number of resistant mutant strains, the mutation rate was calculated and evaluated. The site of mutation introduction was identified by amplifying DNA fragments containing the target gene region of each strain by a colony PCR method, performing DNA sequencing, and performing an alignment analysis based on the sequence of Saccharomyces Genome Database (www.yeastgenome.org).
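The mutation rate calculation described above can be sketched as follows. This is a minimal illustration, not the exact procedure used in the Examples: the function names, the plated volume, and the colony counts in the usage line are hypothetical, and the sketch assumes that the two plates may have been spread from different dilutions of the same suspension.

```python
def cells_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Convert a colony count into a cell density of the undiluted suspension."""
    return colonies * dilution_factor / plated_volume_ml


def mutation_rate(sd_colonies, sd_dilution, can_colonies, can_dilution,
                  plated_volume_ml=0.1):
    """Resistant fraction = density on canavanine plate / density on SD plate."""
    total = cells_per_ml(sd_colonies, sd_dilution, plated_volume_ml)
    resistant = cells_per_ml(can_colonies, can_dilution, plated_volume_ml)
    if total == 0:
        raise ValueError("no colonies on the SD plate; rate is undefined")
    return resistant / total


# Hypothetical counts: 200 colonies at a 10^4 dilution on SD,
# 40 colonies at a 10^2 dilution on the canavanine plate.
print(mutation_rate(200, 1e4, 40, 1e2))
```

Because both densities are normalized to the undiluted suspension, the plated volume cancels as long as it is the same for both plates.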

<Nucleic Acid Operation>

DNA was processed or constructed by any of PCR method, restriction enzyme treatment, ligation, Gibson Assembly method, and artificial chemical synthesis. As the plasmid backbone, the yeast-Escherichia coli shuttle vectors pRS315 (for leucine selection) and pRS426 (for uracil selection) were used. Plasmids were amplified in Escherichia coli strain XL-10 gold or DH5α, and introduced into yeast by the lithium acetate method.

<Construct>

For inducible expression, budding yeast pGal1/10 (SEQ ID NO: 8), which is a bidirectional promoter inducible by galactose, was used. Downstream of the promoter, a nuclear localization signal (ccc aag aag aag agg aag gtg; SEQ ID NO: 9 (encoding PKKKRKV; SEQ ID NO: 10)) was added to the Cas9 gene ORF from Streptococcus pyogenes having codons optimized for eukaryotic expression (SEQ ID NO: 5), the ORF (SEQ ID NO: 1 or 3) of a deaminase gene (PmCDA1 from Petromyzon marinus or hAID from human) was ligated thereto via a linker sequence, and the resultant was expressed as a fusion protein. As the linker sequence, GS linker (repeat of ggt gga gga ggt tct; SEQ ID NO: 11 (encoding GGGGS; SEQ ID NO: 12)), Flag® tag (gac tat aag gac cac gac gga gac tac aag gat cat gat att gat tac aaa gac gat gac gat aag; SEQ ID NO: 13 (encoding DYKDHDGDYKDHDIDYKDDDDK; SEQ ID NO: 14)), Strep-tag® (tgg agc cac ccg cag ttc gaa aaa; SEQ ID NO: 15 (encoding WSHPQFEK; SEQ ID NO: 16)), and other domains are selected and used in combination. Here, in particular, 2xGS, SH3 domain (SEQ ID NO: 17 and 18), and Flag® tag were ligated and used. As a terminator, ADH1 terminator from budding yeast (SEQ ID NO: 19) and Top2 terminator (SEQ ID NO: 20) were ligated. In the domain integration method, Cas9 gene ORF was ligated to SH3 domain via 2xGS linker to give one protein, deaminase was added with SH3 ligand sequence (SEQ ID NOs: 21 and 22) as another protein, and they were ligated to Gal1/10 promoter on both sides and simultaneously expressed. These were incorporated into pRS315 plasmid.
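The correspondence between the nucleotide sequences of the signal, linker and tags and the peptides they encode can be checked by translating the codons with the standard genetic code. The following Python sketch is an illustrative check only; the function and variable names are assumptions, and the sequences are copied from the paragraph above with codon spacing normalized.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA coding sequence (whitespace ignored) into one-letter amino acids."""
    seq = "".join(dna.lower().split())
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


print(translate("ccc aag aag aag agg aag gtg"))      # nuclear localization signal
print(translate("ggt gga gga ggt tct"))              # GS linker unit
print(translate("tgg agc cac ccg cag ttc gaa aaa"))  # Strep-tag
```

The same function can be applied to the Flag tag sequence; the seven-codon nuclear localization signal translates to seven residues.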

In Cas9, a mutation converting the 10th aspartic acid to alanine (D10A, corresponding DNA sequence mutation a29c) and a mutation converting the 840th histidine to alanine (H840A, corresponding DNA sequence mutation ca2518gc) were introduced to remove the ability to cleave the respective DNA strands.
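The nucleotide positions quoted for the two mutations follow from the codon numbering of the ORF. The short Python sketch below (function name hypothetical) shows the arithmetic, assuming that nucleotide 1 is the first base of the start codon.

```python
def codon_nucleotide_positions(residue_number):
    """Return the 1-based ORF positions of the three bases encoding a residue."""
    first = 3 * (residue_number - 1) + 1
    return (first, first + 1, first + 2)


# D10: position 29 is the middle base of the codon, matching a29c.
print(codon_nucleotide_positions(10))   # (28, 29, 30)
# H840: positions 2518-2519 are the first two bases, matching ca2518gc.
print(codon_nucleotide_positions(840))  # (2518, 2519, 2520)
```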

gRNA as a chimeric structure with tracrRNA (from Streptococcus pyogenes; SEQ ID NO: 7) was disposed between SNR52 promoter (SEQ ID NO: 23) and Sup4 terminator (SEQ ID NO: 24), and incorporated into pRS426 plasmid. As gRNA target base sequences, CAN1 gene ORF 187-206 (gatacgttctctatggagga; SEQ ID NO: 25) (target 1), 786-805 (ttggagaaacccaggtgcct; SEQ ID NO: 26) (target 3), 793-812 (aacccaggtgcctggggtcc; SEQ ID NO: 27) (target 4), 563-582 (ttggccaagtcattcaattt; SEQ ID NO: 28) (target 2) and the complementary strand sequence of 767-786 (ataacggaatccaactgggc; SEQ ID NO: 29) (target 5r) were used. For simultaneous expression of a plurality of targets, the sequence from a promoter to a terminator was treated as one set, and a plurality of such sets were incorporated into the same plasmid. They were introduced into cells along with the Cas9-deaminase expression plasmid and intracellularly expressed, and a complex of gRNA-tracrRNA and Cas9-deaminase was formed.
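The listed target coordinates and sequences can be cross-checked against each other without access to the full CAN1 sequence. The following Python sketch is an illustrative consistency check; all names are hypothetical. It verifies that each sequence has the length implied by its coordinates and that targets whose coordinate ranges overlap agree on the shared bases (target 5r is first converted to the ORF strand by reverse complementation).

```python
TARGETS = {
    "target 1": (187, 206, "gatacgttctctatggagga"),
    "target 2": (563, 582, "ttggccaagtcattcaattt"),
    "target 3": (786, 805, "ttggagaaacccaggtgcct"),
    "target 4": (793, 812, "aacccaggtgcctggggtcc"),
}
TARGET_5R = (767, 786, "ataacggaatccaactgggc")  # given on the complementary strand


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


def length_matches(start, end, seq):
    """True if the sequence length equals the inclusive coordinate span."""
    return len(seq) == end - start + 1


def overlap_agrees(a, b):
    """True if two ORF-strand targets carry identical bases where their ranges overlap."""
    (a_start, a_end, a_seq), (b_start, b_end, b_seq) = a, b
    lo, hi = max(a_start, b_start), min(a_end, b_end)
    if lo > hi:
        return True  # no shared positions, nothing to compare
    return a_seq[lo - a_start:hi - a_start + 1] == b_seq[lo - b_start:hi - b_start + 1]


target_5r_on_orf = (767, 786, reverse_complement(TARGET_5R[2]))
print(all(length_matches(*t) for t in TARGETS.values()))
print(overlap_agrees(TARGETS["target 3"], TARGETS["target 4"]))
print(overlap_agrees(target_5r_on_orf, TARGETS["target 3"]))
```

Targets 3 and 4 share positions 793-805, and target 5r shares position 786 with target 3, so both comparisons are non-trivial.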

Example 1: Modification of Genome Sequence by Linking DNA SequenceRecognition Ability of CRISPR-Cas to Deaminase PmCDA1

To test the effect of the genome sequence modification technique of the present invention utilizing deaminase and the nucleic acid sequence recognition ability of CRISPR-Cas, introduction of mutation into CAN1 gene encoding canavanine transporter, whose deficiency results in canavanine resistance, was attempted. As gRNA, a sequence complementary to 187-206 (target 1) of CAN1 gene ORF was used. A chimeric RNA expression vector obtained by linking thereto tracrRNA from Streptococcus pyogenes, and a vector expressing a protein obtained by fusing dCas9, whose nuclease activity was impaired by introducing mutations (D10A and H840A) into Cas9 (SpCas9) from Streptococcus pyogenes, with PmCDA1 from Petromyzon marinus as deaminase were constructed, introduced into the budding yeast by the lithium acetate method, and coexpressed. The results are shown in FIG. 2. When cultured on a canavanine-containing SD plate, only the cells subjected to introduction and expression of gRNA-tracrRNA and dCas9-PmCDA1 formed colonies. Resistant colonies were picked up and the sequence of the CAN1 gene region was determined. As a result, it was confirmed that a mutation was introduced into the target nucleotide sequence (target 1) and the vicinity thereof.

Example 2: Drastic Reduction of Side Effects and Toxicity

In conventional Cas9 and other artificial nucleases (ZFN, TALEN),inhibition of growth and cell death assumedly caused by disorderedcleavage of chromosome occur by targeting a sequence in the genome. Theeffect thereof is particularly fatal for many microorganisms andprokaryotes, and prevents applicability.

Therefore, to verify the safety and cell toxicity of the genome sequence modification technique of the present invention, a comparative test with conventional CRISPR-Cas9 was performed. Using sequences (targets 3, 4) in the CAN1 gene as gRNA targets, the surviving cells were counted immediately after the start of expression induction by galactose and at 6 hr after the induction based on the colony forming ability on an SD plate. The results are shown in FIG. 3. With conventional Cas9, growth was inhibited and cell death was induced, which decreased the number of surviving cells. In contrast, with the present technique (nCas9 D10A-PmCDA1), the cells could continue to grow, and the number of surviving cells drastically increased.

Example 3: Use of Different Linking Scheme

Whether mutation can be introduced into a targeted gene even when Cas9and deaminase are not used as a fusion protein but when a nucleicacid-modifying enzyme complex is formed via a binding domain and aligand thereof was examined. As Cas9, dCas9 used in Example 1 was usedand human AID instead of PmCDA1 was used as deaminase. SH3 domain wasfused with the former, and a binding ligand thereof was fused with thelatter to produce various constructs shown in FIG. 4. In addition,sequences (target 4,5r) in the CAN1 gene were used as gRNA targets.These constructs were introduced into a budding yeast. As a result, evenwhen dCas9 and deaminase were linked via the binding domain, mutationwas efficiently introduced into the targeted site of the CAN1 gene (FIG.4). The mutation introduction efficiency was remarkably improved byintroducing a plurality of binding domains into dCas9. The main site ofmutation introduction was 782nd (g782c) of ORF.

Example 4: High Efficiency and Changes in Mutation Pattern by Use ofNickase

In the same manner as in Example 1 except that D10A mutant nCas9 (D10A)that cleaves only a strand complementary to gRNA, or H840A mutant nCas9(H840A) that cleaves only a strand opposite to a strand complementary togRNA was used instead of dCas9, mutation was introduced into the CAN1gene, and the sequence in the CAN1 gene region of the colony generatedon a canavanine-containing SD plate was examined. It was found that theefficiency increases in the former (nCas9 (D10A)) as compared to dCas9(FIG. 5), and the mutation gathers in the center of the target sequence(FIG. 6). Therefore, this method enables pinpoint introduction ofmutation. On the other hand, it was found in the latter (nCas9 (H840A))that a plurality of random mutations were introduced into a region ofseveral hundred bases from the targeted nucleotide (FIG. 6) along withan increase in the efficiency as compared to dCas9 (FIG. 5).

Similar remarkable introduction of mutation could be confirmed even when the target nucleotide sequence was changed. In this genome editing system using the CRISPR-Cas9 system and cytidine deaminase, it was confirmed as shown in Table 1 that cytosines present within the range of about 2-5 bp from the 5′-side of the target nucleotide sequence (20 bp) were preferentially deaminated. Therefore, by setting the target nucleotide sequence based on this regularity and further combining with nCas9 (D10A), precise genome editing in units of 1 nucleotide is possible. On the other hand, a plurality of mutations can be simultaneously inserted within the range of about several hundred bp in the vicinity of the target nucleotide sequence by using nCas9 (H840A). Furthermore, site specificity may possibly be further varied by changing the linking scheme of deaminase.
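The regularity described above can be used to list candidate cytosines when designing a target. The following Python sketch is a simplified illustration under stated assumptions: the function name and signature are hypothetical, the window is taken as positions 2 to 5 counted from the 5′ end of the 20-base target, and the real preference is only approximate ("about 2-5 bp"). It is applied to some of the CAN1 targets listed in the Construct section.

```python
def window_cytosines(target, orf_start, orf_end, complementary=False, window=(2, 5)):
    """Return ORF positions of cytosines at window positions of the target.

    Positions in `window` are 1-based from the 5' end of the target sequence.
    For a target given on the complementary strand, its 5' end corresponds to
    orf_end, and the ORF strand carries a guanine at the returned position.
    """
    positions = []
    for i in range(window[0], window[1] + 1):
        if target[i - 1].lower() == "c":
            offset = i - 1
            positions.append(orf_end - offset if complementary else orf_start + offset)
    return positions


print(window_cytosines("gatacgttctctatggagga", 187, 206))        # target 1
print(window_cytosines("ttggccaagtcattcaattt", 563, 582))        # target 2
print(window_cytosines("aacccaggtgcctggggtcc", 793, 812))        # target 4
print(window_cytosines("ataacggaatccaactgggc", 767, 786, True))  # target 5r
```

The positions returned for these targets (191; 567; 795 to 797; 782) coincide with sites at which the main mutations were reported (c191a, cc567at, cc795tt and cc796tt, g782c).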

These results show that the kind of Cas9 protein can be changed properly according to the object.

TABLE 1

position of CAN1 gene ORF                   | sequence (SEQ ID NO:)     | site of main mutation introduction
187-206 (target 1)                          | Gatacgttctctatggagga (25) | c191a, g226t
563-582 (target 2)                          | Ttggccaagtcattcaattt (28) | cc567at, c567del
786-805 (target 3)                          | Ttggagaaacccaggtgcct (26) | cc795tt and cc796tt
793-812 (target 4)                          | Aacccaggtgcctggggtcc (27) |
767-786 (complementary strand) (target 5r)  | Ataacggaatccaactgggc (29) | g782c

Example 5: Efficiency Increases Synergistically by Targeting a Pluralityof DNA Sequences in Proximity

Efficiency drastically increased when a plurality of targets in proximity were used simultaneously rather than a single target (FIG. 7). In fact, 10-20% of cells had canavanine-resistant mutation (targets 3, 4). In the Figure, gRNA1 and gRNA2 target target 3 and target 4, respectively. As deaminase, PmCDA1 was used. The effect was confirmed to occur not only when the sequences partly overlapped (targets 3, 4), but also when they were apart by about 600 bp (targets 1, 3). The effect was found both when the DNA sequences were in the same direction (targets 3, 4) and when they were opposing (targets 4, 5r) (FIG. 4).

Example 6: Genetic Modification not Requiring Selection Marker

As for the cells (Targets 3, 4) in which target 3 and target 4 weretargeted in Example 5, 10 colonies were randomly selected from thosegrown on a non-selected (canavanine-free) plate (SD plate not containingLeu and Ura) and the sequences of the CAN1 gene region were determined.As a result, mutation was introduced into the targeted site of the CAN1gene in all examined colonies (FIG. 8). That is, editing can be expectedin almost all expressed cells by selecting a suitable target sequenceaccording to the present invention. Therefore, insertion of a markergene and selection, which are essential for the conventional genemanipulation, are not necessary. This dramatically facilitates andsimplifies gene manipulation and extends the applicability to cropbreeding and the like since a recombinant organism with foreign DNA isnot produced.

In the following Examples, experiment techniques shared by Examples 1-6 were performed in the same manner as above.

Example 7: Simultaneous Editing of a Plurality of Sites (Different Gene)

In a general gene manipulation method, mutation of only one site is generally achieved by one operation due to various restrictions. Thus, whether a simultaneous mutation operation of a plurality of sites is possible using the method of the present invention was tested.

Using positions 3 to 22 of the ORF of Ade1 gene of budding yeast BY4741 strain as the first target nucleotide sequence (Ade1 target5: GTCAATTACGAAGACTGAAC; SEQ ID NO: 30), and positions 767-786 (complementary strand) of the ORF of Can1 gene as the second target nucleotide sequence (Can1 target8 (786-767): ATAACGGAATCCAACTGGGC; SEQ ID NO: 29), DNAs encoding chimeric RNAs of two kinds of gRNAs, each containing a nucleotide sequence complementary thereto and tracrRNA (SEQ ID NO: 7), were both placed on the same plasmid (pRS426), introduced into BY4741 strain together with plasmid nCas9 D10A-PmCDA1 containing a nucleic acid encoding a fusion protein of mutant Cas9 and PmCDA1, and expressed, and introduction of mutation into both genes was verified. The cells were cultured in an SD drop-out medium (uracil and leucine deficient; SD-UL) as a base, which maintains the plasmids. The cells were appropriately diluted, applied on SD-UL medium and canavanine-added medium, and allowed to form colonies. After 2 days of culture at 28° C., colonies were observed, and the incidence of red colonies due to Ade1 mutation and the survival rate in canavanine medium were respectively counted. The results are shown in Table 2.

TABLE 2

medium      | incidence of red colony | survival rate in Canavanine medium (Can) | red colony and Can survival rate
SD-UL       | 0.54 ± 0.04             |                                          |
+canavanine | 0.64 ± 0.14             | 0.51 ± 0.15                              | 0.31 ± 0.04

As a phenotype, the proportion of cells in which mutation was introduced into both Ade1 gene and Can1 gene was high, at about 31%.

Then, colonies on an SD-UL medium were subjected to PCR amplification followed by sequencing. Regions containing the ORF of each of Ade1 and Can1 were amplified, and sequence information of about 500 b surrounding the target sequence was obtained. To be specific, 5 red colonies and 5 white colonies were analyzed to find conversion of the 5th C of Ade1 gene ORF to G in all red colonies and of the 5th C to T in all white colonies (FIG. 9). While the mutation rate of the target is 100%, the desired mutation rate in light of the object of gene destruction is 50%, since the 5th C needs to be changed to G to give a stop codon. Similarly, as for the Can1 gene, mutation was confirmed in the 782nd G of the ORF in all clones (FIG. 9); however, since only the mutation to C affords canavanine resistance, the desired mutation rate is 70%. Desired mutations in both genes were simultaneously obtained in 40% of the clones examined (4 clones out of 10 clones), and practically high efficiency was obtained.
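Why only the C-to-G conversion at the 5th base of the Ade1 ORF destroys the gene can be seen by looking at the affected codon. The following Python sketch is illustrative: the names are hypothetical, and it assumes that the ORF begins with the start codon ATG, so that positions 1-22 are "AT" followed by the 20-base Ade1 target5 sequence (positions 3-22).

```python
# Positions 1-22 of the Ade1 ORF, assuming an ATG start codon whose G is
# position 3, i.e. the first base of Ade1 target5.
ORF_PREFIX = "AT" + "GTCAATTACGAAGACTGAAC"

# Only the three codons relevant here (standard genetic code).
MEANING = {"TCA": "Ser", "TGA": "stop", "TTA": "Leu"}


def substitute(orf, position, new_base):
    """Return the ORF with the base at a 1-based position replaced."""
    index = position - 1
    return orf[:index] + new_base + orf[index + 1:]


def codon_at(orf, position):
    """Return the codon that contains the given 1-based ORF position."""
    start = (position - 1) // 3 * 3
    return orf[start:start + 3]


for new_base in ("G", "T"):
    codon = codon_at(substitute(ORF_PREFIX, 5, new_base), 5)
    print(codon_at(ORF_PREFIX, 5), "->", codon, MEANING[codon])
```

Under this assumption, C-to-G turns the second codon into a stop codon, whereas C-to-T gives a sense codon, which is consistent with the red and white colony phenotypes.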

Example 8: Editing of Polyploid Genome

Many organisms have diploid or polyploid genome. In the conventional mutation introduction methods, mutation is, in principle, introduced into only one homologous chromosome to produce a heterozygous genotype. Therefore, the desired trait is not obtained unless the mutation is dominant, and making it homozygous requires labor and time. Thus, whether the technique of the present invention can introduce mutation into all target alleles on the homologous chromosomes in the genome was tested.

That is, simultaneous editing of Ade1 and Can1 genes was performed in budding yeast YPH501 strain as a diploid strain. The phenotypes of these gene mutations (red colony and canavanine resistance) are recessive, and therefore, these phenotypes do not appear unless the mutation is introduced into both alleles of the gene (homozygous mutation).

Using the ORF of positions 1173-1154 (complementary strand) of Ade1 gene (Ade1 target 1: GTCAATAGGATCCCCTTTT; SEQ ID NO: 31) or of positions 3-22 (Ade1 target 5: GTCAATTACGAAGACTGAAC; SEQ ID NO: 30) as the first target nucleotide sequence, and the ORF of positions 767-786 (complementary strand) of Can1 gene as the second target nucleotide sequence (Can1 target8: ATAACGGAATCCAACTGGGC; SEQ ID NO: 29), DNAs encoding chimeric RNAs of two kinds of gRNAs, each containing a nucleotide sequence complementary thereto and tracrRNA (SEQ ID NO: 7), were both placed on the same plasmid (pRS426), introduced into BY4741 strain together with plasmid nCas9 D10A-PmCDA1 containing a nucleic acid encoding a fusion protein of mutant Cas9 and PmCDA1, and expressed, and introduction of mutation into each gene was verified.

As a result of colony count, it was found that each characteristic of phenotype could be obtained at a high probability (40%-70%) (FIG. 10A).

To confirm mutation, the Ade1 target region of each of a white colony and a red colony was sequenced. Overlapping sequence signals indicating heterozygous mutation were confirmed in the target site of the white colony (FIG. 10B, upper panel, G and T signals overlap at ↓). The phenotype was confirmed to be absent in the colony with heterozygous mutation. On the other hand, homozygous mutation free of overlapping signals was confirmed in the red colony (FIG. 10B, lower panel, T signal at ↓).

Example 9: Genome Editing in Escherichia coli

In this Example, it is demonstrated that this technique effectively functions in Escherichia coli, which is a representative bacterial model organism. In particular, conventional nuclease-type genome editing techniques are fatal for bacteria, and their application is difficult. Thus, the superiority of this technique is emphasized. Together with the results in yeast, which is a eukaryotic model cell, it is shown that this technique is widely applicable to any species irrespective of prokaryote or eukaryote.

Amino acid mutations D10A and H840A were introduced into the Streptococcus pyogenes Cas9 gene containing a bidirectional promoter region (dCas9), a construct to be expressed as a fusion protein with PmCDA1 via a linker sequence was constructed, and chimeric gRNAs each encoding a sequence complementary to one of the target nucleotide sequences were simultaneously included in the plasmid (full-length nucleotide sequence is shown in SEQ ID NO: 32, in which a sequence complementary to each of the target sequences is introduced into the site of n₂₀) (FIG. 11A).

First, positions 426-445 (T CAA TGG GCT AAC TAC GTT C; SEQ ID NO: 33) of the ORF of Escherichia coli galK gene were introduced as a target nucleotide sequence into a plasmid, various Escherichia coli strains (XL10-gold, DH5α, MG1655, BW25113) were transformed with the plasmid by the calcium method or electroporation method, SOC medium was added, recovery culture was performed overnight, plasmid-carrying cells were selected on ampicillin-containing LB medium, and colonies were formed. Introduction of mutation was verified by direct sequencing from colony PCR. The results are shown in FIG. 11B.

Independent colonies (1-3) were selected randomly, and their sequences were analyzed. As a result, the 427-position C of the ORF was converted to T (clones 2, 3) at a probability of not less than 60%, and the occurrence of gene destruction generating a stop codon (TAA) was confirmed.

Then, with a complementary sequence (5′-GGTCCATAAACTGAGACAGC-3′; SEQ ID NO: 34) of the 1530-1549 base region of the rpoB gene ORF, which is an essential gene, as a target, a particular point mutation was introduced by a method similar to the above-mentioned method in an attempt to impart rifampicin resistance to Escherichia coli. The sequences of colonies selected in a nonselective medium (none), a 25 μg/ml rifampicin (Rif25)-containing medium and a 50 μg/ml rifampicin (Rif50)-containing medium were analyzed. As a result, it was confirmed that conversion of the 1546-position G of the ORF to A introduced an amino acid mutation from Asp (GAC) to Asn (AAC), and rifampicin resistance was imparted (FIG. 11C, upper panel). A 10-fold dilution series of the cell suspension after transformation treatment was spotted on a nonselective medium (none), a 25 μg/ml rifampicin (Rif25)-containing medium and a 50 μg/ml rifampicin (Rif50)-containing medium and cultured. As a result, it is estimated that a rifampicin-resistant strain was obtained at about 10% frequency (FIG. 11C, lower panel).
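The relationship between the complementary-strand target and the Asp-to-Asn change can be traced from the target sequence alone. The following Python sketch is an illustrative check with hypothetical names; it assumes that the 5′ end of the target corresponds to ORF position 1549 and that codons are numbered from ORF position 1.

```python
TARGET = "GGTCCATAAACTGAGACAGC"  # complementary strand of ORF positions 1530-1549
ORF_START, ORF_END = 1530, 1549


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def codon_containing(sense, orf_start, position):
    """Return the codon of the ORF strand that contains the given ORF position."""
    first = (position - 1) // 3 * 3 + 1  # ORF position of the codon's first base
    index = first - orf_start
    return sense[index:index + 3]


sense = reverse_complement(TARGET)  # ORF strand, positions 1530-1549
index_1546 = 1546 - ORF_START
edited = sense[:index_1546] + "A" + sense[index_1546 + 1:]

print(codon_containing(sense, ORF_START, 1546))   # GAC (Asp)
print(codon_containing(edited, ORF_START, 1546))  # AAC (Asn)
print(TARGET[ORF_END - 1546])  # base on the target strand paired with position 1546
```

The base paired with position 1546 is a cytosine lying 4 bases from the 5′ end of the target, so its deamination appears as a G-to-A change on the ORF strand.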

As shown above, by this technique, a new function can be added by a particular point mutation, rather than simple gene destruction. This technique is superior in that an essential gene can be directly edited.

Example 10: Adjustment of Editing Base Site by gRNA Length

Conventionally, the gRNA length relative to a target nucleotide sequence is basically 20b, and cytosine (or guanine in the opposite strand) at a site 2-5b from the 5′-terminus thereof (15-19b upstream of PAM sequence) is used as a mutation target. Whether expression of gRNAs of different lengths can shift the site of the base to be targeted was examined (FIG. 12A).

An experimental example performed on Escherichia coli is shown in FIG. 12B. A site containing many cytosines on the Escherichia coli genome was searched for, and the experiment was performed using gsiA gene, which is a putative ABC-transporter. Substituted cytosines were examined while changing the length of the target to 24 bp, 22 bp, 20 bp and 18 bp, and it was found that the 898th and 899th cytosines were substituted by thymine in the case of 20 bp (standard length). When the target site was longer than 20 bp, the 896th and 897th cytosines were also substituted, and when the target site was shorter, the 900th and 901st cytosines were also substituted. Thus, the target site could be shifted by changing the length of the gRNA.

Example 11: Development of Temperature Dependent Genome Editing Plasmid

A plasmid that induces expression of the nucleic acid-modifying enzymecomplex of the present invention under high temperature conditions wasdesigned. While optimizing efficiency by limitatively controlling theexpression state, reduction of side effects (growth inhibition of host,unstable mutation introduction efficiency, mutation of site differentfrom target and the like) was aimed. Simultaneously, a simultaneous andeasy removal of plasmid after editing was intended by combining amechanism for ceasing the replication of plasmid at a high temperature.The detail of the experiment is shown below.

With the temperature sensitive plasmid pSC101-Rep101 system (sequence of pSC101 ori is shown in SEQ ID NO: 35, and sequence of temperature sensitive Rep101 is shown in SEQ ID NO: 36) as a backbone, the temperature sensitive λ repressor (cI857) system was used for expression induction. For genome editing, G113E mutation imparting RecA resistance was introduced into the λ repressor, to ensure normal function even under SOS response (SEQ ID NO: 37). dCas9-PmCDA1 (SEQ ID NO: 38) was ligated to Right Operator (SEQ ID NO: 39), and gRNA (SEQ ID NO: 40) was ligated downstream of Left Operator (SEQ ID NO: 41) to regulate the expression (full-length nucleotide sequence of the constructed expression vector is shown in SEQ ID NO: 42). During culture at not more than 30° C., transcription of gRNA and expression of dCas9-PmCDA1 are suppressed, and the cells can grow normally. When cultured at not less than 37° C., transcription of gRNA and expression of dCas9-PmCDA1 are induced, and replication of the plasmid is suppressed simultaneously. Therefore, the nucleic acid-modifying enzyme complex necessary for genome editing is transiently supplied, and the plasmid can be removed easily after editing (FIG. 13).

Specific protocol of the base substitution is shown in FIG. 14.

The culture temperature for plasmid construction is set at around 28° C., and an Escherichia coli colony retaining the desired plasmid is first established. Then, the colony is used directly or, when the strain is to be changed, the plasmid is extracted, the target strain is transformed therewith, and the obtained colony is used. Liquid culture at 28° C. is performed overnight. Thereafter, the culture is diluted with the medium, induction culture is performed at 42° C. for 1 hr to overnight, and the cell suspension is appropriately diluted and spread or spotted on a plate to acquire single colonies.

As a verification experiment, point mutation introduction into the essential gene rpoB was performed. When rpoB, which is one of the RNA polymerase-constituting factors, is deleted or its function is lost, Escherichia coli cannot survive. On the other hand, it is known that resistance to the antibiotic rifampicin (Rif) is acquired when a point mutation is introduced at a particular site. Therefore, aiming at such introduction of point mutation, a target site was selected and the assay was performed.

The results are shown in FIG. 15. In the upper left panel, the left shows an LB (chloramphenicol addition) plate, and the right shows a rifampicin-added LB (chloramphenicol addition) plate, and samples with or without chloramphenicol were prepared and cultured at 28° C. or 42° C. When cultured at 28° C., the rate of Rif resistance was low; however, when cultured at 42° C., rifampicin resistance was obtained with extremely high efficiency. When 8 colonies each obtained on LB (non-selection) were sequenced, the 1546th guanine (G) was substituted by adenine (A) in not less than 60% of the strains cultured at 42° C. (lower and upper left panels). The actual sequence spectrum also clearly shows that the base was completely substituted (lower right panel).

Similarly, base substitution of galK, which is one of the factors involved in galactose metabolism, was performed. Since metabolism of 2-deoxy-galactose (2DOG), which is an analogue of galactose, by galK is fatal to Escherichia coli, this was used as a selection method. Target sites were set such that a missense mutation is induced in target 8, and a stop codon is introduced in target 12 (FIG. 16 lower right).

The results are shown in FIG. 16. In the upper left and lower leftpanels, the left shows an LB (chloramphenicol addition) plate, and theright shows a 2-DOG-added LB (chloramphenicol addition) plate, andsamples with or without chloramphenicol were prepared and cultured at28° C. or 42° C. In target 8, colony was produced only slightly on a2-DOG addition plate (upper left panel), 3 colonies on LB (red frame)were sequenced to determine that the 61st cytosine (C) was substitutedby thymine (T) in all colonies (upper right). This mutation is assumedto be insufficient to lose function of galK. On the other hand, intarget 12, colony was obtained on 2-DOG addition plate by culture at 28°C. and 42° C. (lower left panel). 3 colonies on LB were sequenced todetermine that the 271st cytosine was substituted by thymine in allcolonies (lower right). It was shown that mutation can be alsointroduced stably and highly efficiently in such different targets.

The contents disclosed in any publication cited herein, includingpatents and patent applications, are hereby incorporated in theirentireties by reference, to the extent that they have been disclosedherein.

This application is based on patent application Nos. 2014-43348 and 2014-201859 filed in Japan (filing dates: Mar. 5, 2014 and Sep. 30, 2014, respectively), the contents of which are incorporated in full herein.

INDUSTRIAL APPLICABILITY

The present invention makes it possible to safely introduce site specific mutation into any species without insertion of a foreign DNA or double-stranded DNA breaks. It is also possible to set a wide range of mutation introduction from a pin point of one base to several hundred bases, and the technique can also be applied to topical evolution induction by introduction of random mutation into a particular restricted region, which has been almost impossible heretofore, and is extremely useful.

What is claimed is:
 1. A method of modifying a targeted site of a doublestranded DNA, comprising: contacting said double stranded DNA with atleast one complex which comprises (i) a nucleic acid base convertingenzyme linked to (ii) a nucleic acid sequence-recognizing module thatspecifically binds to a target nucleotide sequence in the targeted siteof the double stranded DNA, thereby to convert one or more nucleotidesin the targeted site to one or more different nucleotides or to deleteone or more nucleotides in the targeted site or to insert one or morenucleotides into said targeted site, without introducing a double strandbreak (DSB) in said double stranded DNA in the targeted site, whereinthe nucleic acid sequence-recognizing module is a CRISPR-Cas system, andwherein the CRISPR-Cas system comprises a nickase protein.
 2. The method of claim 1 which comprises contacting the double stranded DNA with two or more complexes that each comprise a nucleic sequence-recognizing module that specifically binds to a different target nucleotide sequence.
 3. The method of claim 2, wherein the different target nucleotide sequences are present in different genes.
 4. The method of claim 1, wherein the nucleic acid base converting enzyme is a deaminase.
 5. The method of claim 4, wherein the deaminase is a cytidine deaminase.
 6. The method of claim 1, wherein the step of contacting comprises introducing a nucleic acid encoding the at least one complex into a cell which comprises the double stranded DNA.
 7. The method of claim 6, wherein the cell is a prokaryotic cell, an eukaryotic cell, a microbial cell, a plant cell, an insect cell, an animal cell, a vertebrate cell, or a mammalian cell.
 8. A method of modifying a targeted site in double stranded genomic DNA in each of two or more targeted alleles on homologous chromosomes in a polyploid cell, the method comprising: contacting said double stranded genomic DNA of the polyploid cell with at least one complex which comprises (i) a nucleic acid base converting enzyme linked to (ii) a nucleic acid sequence-recognizing module that specifically binds to a target nucleotide sequence in the targeted site in the double stranded genomic DNA in each of said two or more targeted alleles on homologous chromosomes in the polyploid cell, thereby to convert one or more nucleotides in said targeted site in the double stranded genomic DNA in each of said two or more targeted alleles on homologous chromosomes to one or more different nucleotides, or to delete one or more nucleotides in said targeted site in the double stranded genomic DNA in each of said two or more targeted alleles on homologous chromosomes, or to insert one or more nucleotides into said targeted site in the double stranded genomic DNA in each of said two or more targeted alleles on homologous chromosomes, without introducing a double strand break (DSB) in said double stranded genomic DNA, wherein the nucleic acid sequence-recognizing module is a CRISPR-Cas system, and wherein the CRISPR-Cas system comprises a nickase protein.
 9. The method of claim 6, wherein the step of introducing the nucleic acid encoding the at least one complex into the cell comprises introducing an expression vector comprising the nucleic acid encoding the at least one complex into the cell, wherein the nucleic acid is under regulation of an inducible regulatory region, the method further comprising a step of inducing expression of the nucleic acid for an expression period to stabilize the conversion of one or more nucleotides in the targeted site to one or more different nucleotides, or the deletion of one or more nucleotides, or the insertion of one or more nucleotides into said targeted site in the double stranded DNA.
 10. The method of claim 9, wherein the target nucleotide sequence in the targeted site in the double stranded DNA is present in a gene essential for survival of the cell.
 11. A nucleic acid-modifying enzyme complex, comprising: a nucleic acid base converting enzyme, linked to (ii) a nucleic acid sequence-recognizing module that specifically binds to a target nucleotide sequence in a targeted site of a double stranded DNA, wherein the nucleic acid sequence-recognizing module is a CRISPR-Cas system comprising either a Cas protein that is incapable of introducing a double strand break (DSB) in double stranded DNA or a Cas protein in which cleavage activity for only one strand of double stranded DNA has been inactivated, and wherein the complex is capable of converting one or more nucleotides in the targeted site to one or more other nucleotides, or is capable of deleting one or more nucleotides, or is capable of inserting one or more nucleotides into said targeted site, without introducing a double strand break (DSB) in double stranded DNA in the targeted site.
 12. A nucleic acid encoding the nucleic acid-modifying enzyme complex of claim 11.
 13. The method of claim 1, wherein the nickase protein is a Cas9 D10A mutant nickase protein (nCas9(D10A)).
 14. The method of claim 1, wherein the nickase protein is a Cas9 H840A mutant nickase protein (nCas9(H840A)).
 15. The nucleic acid-modifying enzyme complex of claim 11, wherein only one of two DNA cleavage abilities of the Cas protein is inactivated.
 16. A nucleic acid encoding the nucleic acid-modifying enzyme complex of claim 15.